POLYKETIDE-ASSOCIATED SUGAR BIOSYNTHESIS GENES

This application claims the benefit of U.S. Serial No. 08/576,626 filed December 21,
1995, now pending.

Field of the Invention 
The present invention relates to methods for directing the biosynthesis of specific 
polyketide analogs by genetic manipulation. In particular, sugar biosynthesis genes are 
manipulated to produce precise, novel glycosylation-modified macrolides of predicted 
structure.

Background of the Invention 
Polyketides are a large class of natural products that includes many important 
antibiotic, antifungal, anticancer, and anti-helminthic compounds such as erythromycins,
amphotericins, daunorubicins, and avermectins. Their synthesis proceeds by an ordered
condensation of acyl esters to generate carbon chains of varying length, side chain, and 
reduction pattern that are differentially cyclized and subsequently modified to give the mature 
polyketides. For many polyketides, maturation includes the addition of one or more sugar 
residues to the cyclized carbon chain. The sugar residues are frequently critical to the
biological activity of the mature polyketide.

Streptomyces and the closely related Saccharopolyspora genera are prodigious 
producers of polyketide metabolites. Because of the commercial significance of these 
compounds, a great amount of effort has been expended in the study of Streptomyces 
genetics. Consequently, much is known about Streptomyces and several cloning vectors exist
for introducing DNA into these organisms.

Although many polyketides have been identified, there remains a need to obtain
novel glycosylation-modified (as defined herein) polyketide structures with enhanced
properties. Current methods of obtaining such molecules include screening of biological
samples and chemical modification of existing polyketides, both of which are costly and time
consuming. Current screening methods are based on gross properties of the molecule, e.g.
antibacterial or antifungal activity, and both a priori knowledge of the structure of the
molecules obtained and predetermination of enhanced properties are virtually impossible.
Standard chemical modification of existing structures has been successfully employed, but is
limited in the number of types of compounds obtainable. Furthermore, the poor yield of
multistep chemical syntheses often limits the practicality of this approach. The following
modifications to sugar residues bound to polyketides are particularly difficult or inefficient at
the present time: changing the stereochemistry of specific hydroxyl or methyl groups, changing
the oxidation state of specific hydroxyl groups, and deoxygenating specific carbons.
Accordingly, there exists a need to obtain molecules in which such changes are specified and
performed; meeting this need would represent an improvement in the technology for producing
altered glycosylation-modified polyketide molecules of predicted structure.

The present invention overcomes these problems by providing the genetic sequence of 
sugar biosynthesis genes involved in the biosynthesis of polyketide-associated sugars. 

Summary of the Invention
In one aspect, the present invention provides an isolated single or double stranded 
polynucleotide, typically DNA, having a nucleotide sequence which comprises (a) a 
nucleotide sequence selected from the group consisting of (i) the sense sequence of FIG. 4A 
(SEQ ID NO:1) from about nucleotide position 54 to about nucleotide position 1136; (ii) the
sense sequence of SEQ ID NO:1 from about nucleotide position 1147 to about nucleotide
position 2412; (iii) the sense sequence of SEQ ID NO:1 from about nucleotide position 2409
to about nucleotide position 3410; (iv) the sense sequence of FIG. 4B (SEQ ID NO:2) from
about nucleotide position 80 to about nucleotide position 1048; (v) the sense sequence of
SEQ ID NO:2 from about nucleotide position 1048 to about nucleotide position 2295; (vi) the
sense sequence of SEQ ID NO:2 from about nucleotide position 2348 to about nucleotide
position 3061; (vii) the sense sequence of SEQ ID NO:2 from about nucleotide position 3214
to about nucleotide position 4677; (viii) the sense sequence of SEQ ID NO:2 from about 
nucleotide position 4674 to about nucleotide position 5879; (ix) the sense sequence of SEQ 
ID NO:2 from about nucleotide position 5917 to about nucleotide position 7386; and (x) the 
sense sequence of SEQ ID NO:2 from about nucleotide position 7415 to about nucleotide 
position 7996; (b) sequences complementary to the sequences of (a); (c) sequences that, on 
expression, encode a polypeptide encoded by the sequences of (a); and (d) analogous 
sequences that hybridize under stringent conditions to the sequences of (a) and (b). A 
preferred molecule is a DNA molecule. In another embodiment, the polynucleotide is an 
RNA molecule. 

In another embodiment, a DNA molecule of the present invention is contained in an 
expression vector. The expression vector preferably further comprises an enhancer-promoter 
operatively linked to the polynucleotide. In a preferred embodiment, the DNA molecule in 
the vector is one of the preferred sequences mentioned above. In an especially preferred 
embodiment, the DNA molecule in the vector is the sequence of SEQ ID NO:2 from about 
nucleotide position 80 to about nucleotide position 1048. 

The present invention still further provides for a host cell transformed with a 
polynucleotide or expression vector of this invention. Preferably, the host cell is a bacterial 
cell selected from the group consisting of Saccharopolyspora spp., Streptomyces spp. and
E. coli.

The present invention also provides methods to produce novel glycosylation-modified
polyketide structures by designing and introducing specified changes in the DNA governing
the synthesis and attachment of sugar residues to polyketides. According to one method, the 
biosynthesis of specific glycosylation-modified polyketides is accomplished by genetic 
manipulation of a polyketide-producing microorganism comprising the steps of isolating a 
sugar biosynthesis gene-containing DNA sequence from those described above; identifying 
within the gene-containing DNA sequence one or more DNA fragments responsible for the 
biosynthesis of a polyketide-associated sugar or its attachment to the polyketide; introducing
one or more specified changes into the DNA fragment or fragments, thereby resulting in an
altered DNA sequence; introducing the altered DNA sequence into a polyketide-producing 
microorganism to replace the original sequence whereby the altered DNA sequence, when 
translated, results in altered enzymatic activity capable of effecting the production of the 
specific glycosylation-modified polyketide; growing a culture of the altered polyketide- 
producing microorganism under conditions suitable for the formation of the specific 
glycosylation-modified polyketide; and isolating said specific glycosylation-modified 
polyketide from the culture. 

In a second method, the biosynthesis of specific glycosylation-modified polyketides is
accomplished by isolating a sugar biosynthesis gene-containing DNA sequence from
those described above; identifying within the gene-containing DNA sequence one or more 
DNA fragments responsible for the biosynthesis of a polyketide-associated sugar or its 
attachment to the polyketide; reversing the strand orientation of the DNA fragment or 
fragments, thereby resulting in an altered DNA sequence which, when transcribed, results in 
production of an antisense mRNA; introducing the altered DNA sequence into a polyketide- 
producing microorganism having an mRNA capable of binding to the antisense mRNA which 
results in altered enzymatic activity capable of effecting the production of the specific 
glycosylation-modified polyketide; growing a culture of the altered polyketide-producing 
microorganism under conditions suitable for the formation of the specific glycosylation- 
modified polyketide; and isolating the specific glycosylation-modified polyketide from the 
culture. 
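
The strand reversal used in the second method can be illustrated with a minimal Python sketch. The fragment below is a made-up sequence chosen only for illustration; it is not taken from SEQ ID NO:1 or SEQ ID NO:2, and the function names are hypothetical. The sketch assumes that a fragment cloned in reverse orientation is transcribed as the reverse complement of the original sense strand.

```python
# Complement of each DNA base (IUPAC ambiguity codes are not handled in this sketch).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def to_rna(coding_strand: str) -> str:
    """Write the transcript of a coding strand by replacing T with U."""
    return coding_strand.upper().replace("T", "U")


def rna_reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return rna.upper().translate(str.maketrans("ACGU", "UGCA"))[::-1]


# Hypothetical sense-strand fragment (illustration only).
fragment = "ATGAGCGGTCTGACC"

# Transcript of the fragment in its native orientation.
sense_mrna = to_rna(fragment)

# Transcript of the same fragment after its strand orientation is reversed.
antisense_rna = to_rna(reverse_complement(fragment))

# The antisense transcript is fully complementary to the sense mRNA.
fully_complementary = rna_reverse_complement(antisense_rna) == sense_mrna
```

Under these assumptions the two transcripts can base-pair along their full length, which is the property the second method relies upon.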

In a third method, the biosynthesis of specific glycosylation-modified polyketides is
accomplished by isolating a sugar biosynthesis gene-containing DNA sequence from
those described above; identifying within the gene-containing DNA sequence one or more 
DNA fragments responsible for the biosynthesis of a polyketide-associated sugar or its 
attachment to the polyketide; introducing the DNA fragment or fragments into a polyketide- 
producing microorganism whereupon transcription and translation of the DNA fragment or 
fragments generate an altered polyketide-producing microorganism that is capable of 
producing the specific glycosylation-modified polyketide; growing a culture of the 
polyketide-producing microorganism containing the DNA fragment or fragments under 
conditions suitable for the formation of the specific glycosylation-modified polyketide; and
isolating the specific glycosylation-modified polyketide from the culture. 

Preferably, the sugar biosynthesis gene-containing DNA sequence of the processes 
described above comprises genes which encode an enzymatic activity involved in the 
biosynthesis of L-mycarose and/or D-desosamine. More preferably, the sugar biosynthesis
gene-containing DNA sequence comprises the sequence of SEQ ID NO:2 from about 
nucleotide position 80 to about nucleotide position 1048. 

The present invention is especially useful in manipulating sugar biosynthesis genes
from Streptomyces and Saccharopolyspora, organisms that provide over one-half of the
clinically useful antibiotics.

Brief Description of the Drawings 
FIG. 1A illustrates the organization of the erythromycin biosynthetic gene cluster and
the genetic designations of the biosynthetic genes; FIG. 1B illustrates an abbreviated
erythromycin biosynthetic scheme that broadly associates the biosynthetic genes with their
role in erythromycin biosynthesis. Seven eryB genes, eryBI - eryBVII, are responsible for the
biosynthesis of L-mycarose or its attachment to the erythronolide B ring, and six eryC genes,
eryCI - eryCVI, are responsible for the biosynthesis of D-desosamine or its attachment to
3-α-mycarosylerythronolide B. The dashed arrows indicate that the pathway through
erythromycin B is not the principal natural biosynthetic route to erythromycin A.

FIG. 2 illustrates the proposed scheme for the biosynthesis of L-mycarose and the 
eryB genes responsible for the specific steps. 

FIG. 3 illustrates the proposed scheme for the biosynthesis of D-desosamine and the
eryC genes responsible for the specific steps.

FIG. 4A(1-4) illustrates the nucleotide sequence (SEQ ID NO:1) of the sugar
biosynthesis genes eryCII (coordinates 54-1136), eryCIII (coordinates 1147-2412), and
eryBII (coordinates 2409-3410), with corresponding translation of the open reading frames
(SEQ ID NO:3, SEQ ID NO:4 and SEQ ID NO:5 respectively). Standard one letter codes for
the amino acids appear beneath their respective nucleic acid codons as described herein.

FIG. 4B(1-9) illustrates the nucleotide sequence (SEQ ID NO:2) of the sugar
biosynthesis genes eryBIV (coordinates 80-1048), eryBV (coordinates 1048-2295), eryCVI
(coordinates 2348-3061), eryBVI (coordinates 3214-4677), eryCIV (coordinates 4674-5879),
eryCV (coordinates 5917-7386), and eryBVII (coordinates 7415-7996), with corresponding
translation of the putative open reading frames (SEQ ID NO:6, SEQ ID NO:7, SEQ ID
NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11 and SEQ ID NO:12 respectively).
Standard one letter codes for the amino acids appear beneath their respective nucleic acid
codons as described herein.

FIG. 5A illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryBIV gene of Sac. erythraea (SEQ ID NO:6) and the sugar
biosynthesis enzymes encoded by the ascF gene of Yersinia pseudotuberculosis [Thorson et
al., J. Bacteriol., 176:5483 (1994)] (SEQ ID NO:13), the rfbJ gene of Salmonella enterica
[Jiang et al., Mol. Microbiol., 5:695 (1991)] (SEQ ID NO:14), the strL gene of Streptomyces
griseus [Pissowotzki et al., Mol. Gen. Genet., 241:193 (1993)] (SEQ ID NO:15) and the galE
gene of Escherichia coli [Lemaire and Hill, Nucl. Acids Res., 14:7705 (1986)] (SEQ ID
NO:16). In this and all other Figures in which amino acid sequence identity is compared,
capitalized letters represent consensus (identical) amino acids between species or amino acids
which are conservative substitutions for the consensus residues. Also in each Figure, the
sequence identified as "consensus" is merely a convenient representation of conserved amino
acids and is not intended as a representation of any existing polypeptide sequence.

FIG. 5B illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryBVII gene of Sac. erythraea (SEQ ID NO:12) and the sugar
biosynthesis enzymes encoded by the strM gene of Streptomyces griseus [Pissowotzki et al.,
Mol. Gen. Genet., 241:193 (1993)] (SEQ ID NO:17), the rfbC gene of Salmonella enterica
[Jiang et al., Mol. Microbiol., 5:695 (1991)] (SEQ ID NO:18), the rfbF gene of Yersinia
enterocolitica [Zhang et al., Mol. Microbiol., 9:309 (1993)] (SEQ ID NO:19), and the ascE
gene of Yersinia pseudotuberculosis [Thorson et al., J. Bacteriol., 176:5483 (1994)] (SEQ ID
NO:20).

FIG. 5C illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryCIV gene of Sac. erythraea (SEQ ID NO:10) and the sugar
biosynthesis enzymes encoded by the eryCI gene of Sac. erythraea [Dhillon et al., Mol.
Microbiol., 3:1405 (1989)] (SEQ ID NO:21), the ascC gene of Yersinia pseudotuberculosis
[Weigel et al., Biochemistry, 31:2129 (1992); Thorson et al., J. Am. Chem. Soc., 115:6993
(1993); Thorson et al., J. Bacteriol., 176:5483 (1994)] (SEQ ID NO:22), the dnrJ gene of
Streptomyces peucetius [Stutzman-Engwall et al., J. Bacteriol., 174:144 (1992)] (SEQ ID
NO:23), the prg1 gene of Streptomyces alboniger [Lacalle et al., EMBO J., 11:785 (1992)]
(SEQ ID NO:24), and the strS gene of Streptomyces griseus [Distler et al., Gene, 115:105
(1992)] (SEQ ID NO:25).

FIG. 5D illustrates the amino acid sequence identity between the sugar biosynthesis
enzymes encoded by the eryBV and eryCIII genes of Sac. erythraea (SEQ ID NO:7 and SEQ
ID NO:4 respectively) and the sugar biosynthesis enzyme encoded by the dnrS gene of
Streptomyces peucetius [Otten et al., J. Bacteriol., 177:6688 (1995)] (SEQ ID NO:26).

FIG. 5E illustrates the amino acid sequence identity between the sugar biosynthesis
enzyme encoded by the eryCVI gene of Sac. erythraea (SEQ ID NO:8) and the sugar
biosynthesis enzymes encoded by the srmX gene of Streptomyces ambofaciens [Geistlich et
al., Mol. Microbiol., 6:2019 (1992)] (SEQ ID NO:27), the rdmD gene of Streptomyces
purpurascens [GenBank Accession: U10405] (SEQ ID NO:28) and the glycine
methyltransferase of Rattus norvegicus [Ogawa et al., Eur. J. Biochem., 168:141 (1987)]
(SEQ ID NO:29).

FIGS. 6A through 6D illustrate the compounds conceivably formed in Examples 1-4
respectively and are representative of compounds formed from Type I (FIG. 6A), Type II
(FIG. 6B), and Type III (FIGS. 6C and 6D) alterations.

FIG. 7 illustrates the construction of the expression plasmid pASX2 described in
Example 2. For FIGS. 7-13 the following abbreviations have been used: amp, ampicillin
resistance gene; tsr, thiostrepton resistance gene; ROP, repressor of plasmid synthesis gene;
eryBI, eryBII, eryBIII, eryBIV, eryBV, eryBVI, eryBVII, eryCI, eryCII, eryCIII, eryCIV,
eryCV, and eryCVI, the erythromycin biosynthetic genes involved in the synthesis of
mycarose or its attachment to the macrolide ring (eryB) or the synthesis of desosamine or its
attachment to the macrolide ring (eryC) [the thin arrows above a gene indicate its relative size
and the direction of transcription]; ori-E. coli, an origin of DNA replication that functions in
E. coli, in the specific examples the ColE1 origin; ori-Streptomyces, an origin of DNA
replication that functions in Streptomyces, in the specific examples the pJV1 origin
[Servin-Gonzalez et al., Microbiology, 141:2499 (1995)]; p-ermE*, a modified promoter for the
erythromycin resistance gene; t-fd, the gene VIII transcription terminator of bacteriophage fd;
PCR, polymerase chain reaction. Restriction enzyme sites have been indicated by their
standard commercial names (e.g. BamHI, EcoRI, etc.). The abbreviations appended to the
large arrows in the plasmid synthetic schemes summarize each of the steps involved in the
plasmid constructions. These steps are described fully in the relevant Examples.

FIG. 8 illustrates the construction of the eryBVII antisense expression plasmid
pASBVII described in Example 2.

FIG. 9A illustrates the construction of the carrier plasmid pK1.

FIG. 9B-E illustrates the construction of plasmid pKB6 which carries all of the eryB
genes and is described in Example 3.

FIG. 10 illustrates the construction of expression plasmid pX1 described in Example 3.

FIG. 11 illustrates the construction of the eryB expression plasmids pXSB6 and pXB6
described in Example 3.

FIG. 12A-B illustrate the construction of plasmid pKC4 which carries all of the eryC 
genes described in Example 4. 

FIG. 13 illustrates the construction of the eryC expression plasmids pXSC4 and pXC4 
described in Example 4. 

Detailed Description of the Invention 

I. The Invention

The present invention provides isolated and purified polynucleotides that encode 
enzymes or fragments thereof responsible for the biosynthesis of polyketide-associated sugars 
or their attachment to polyketides, vectors containing those polynucleotides, host cells 
transformed with those vectors, a process of making novel glycosylated polyketides using
those polynucleotides and vectors, and isolated and purified recombinant polypeptides and
polypeptide fragments thereof.

II. Definitions 

For the purposes of the present invention as disclosed and claimed herein, the 
following terms are defined.

The term "polyketide" as used herein refers to a large and diverse class of natural 
products, including but not limited to antibiotic, antifungal, anticancer, and anti-helminthic 
compounds. Antibiotics include, but are not limited to, anthracyclines and macrolides of
different types (polyenes and avermectins as well as classical macrolides such as
erythromycins).

The term "glycosylated polyketide" refers to any polyketide that contains one or more
sugar residues.

The term "glycosylation-modified polyketide" refers to a polyketide having a changed
glycosylation pattern or configuration relative to that particular polyketide's unmodified or 
native state. 

The term "polyketide-producing microorganism" as used herein includes any
microorganism that can produce a polyketide naturally or after being suitably engineered (i.e.
genetically). Examples of actinomycetes and the polyketides they naturally produce include,
but are not limited to, those listed in Table 1 below (see Hopwood, D.A. and Sherman, D.H.,
Annu. Rev. Genet., 24:37-66 (1990), incorporated herein by reference).

Table 1

Organism                         Polyketide Produced

Saccharopolyspora erythraea      Erythromycin
Streptomyces ambofaciens         Spiramycin
Streptomyces avermitilis         Avermectin
Streptomyces fradiae             Tylosin
Streptomyces griseus             Candicidin, monactin, griseusin
Streptomyces violaceoniger       Granaticin
Streptomyces thermotolerans      Carbomycin
Streptomyces rimosus             Oxytetracycline
Streptomyces peucetius           Daunorubicin
Streptomyces coelicolor          Actinorhodin
Streptomyces glaucescens         Tetracenomycin
Streptomyces roseofulvus         Frenolicin
Streptomyces cinnamonensis       Monensin
Streptomyces curacoi             Curamycin
Amycolatopsis mediterranei       Rifamycin

Other examples of polyketide-producing microorganisms that produce polyketides
naturally include various Actinomadura, Dactylosporangium and Nocardia strains.

The term "sugar biosynthesis genes" as used herein refers to sequences of DNA from 
Saccharopolyspora erythraea that encode sugar biosynthesis enzymes and is intended to 
include sequences of DNA from other polyketide-producing microorganisms which are 
identical or analogous to those obtained from Saccharopolyspora erythraea. 

The term "sugar biosynthesis enzymes" as used herein refers to polypeptides which
are involved in the biosynthesis and/or attachment of polyketide-associated sugars and their
derivatives and intermediates.

The term "polyketide-associated sugar" refers to a sugar that is known to attach to
polyketides or that can be attached to polyketides by the processes described herein. 

The term "sugar derivative" refers to a sugar which is naturally associated with a
polyketide but which is altered relative to the unmodified or native state; examples
include, by way of illustration only, N-3-a-desdimethyl D-desosamine, D-mycarose,
4-keto-L-mycarose, 4-keto-D-mycarose, 3-desmethyl L-mycarose and 3-desmethyl D-mycarose.

The term "sugar intermediate" refers to an intermediate compound produced in a 
sugar biosynthesis pathway. 

The term "eryB" as used herein refers to sequences of DNA that encode enzymes
involved specifically in the biosynthesis of the deoxysugar L-mycarose.

The term "eryC" as used herein refers to sequences of DNA that encode enzymes 
involved specifically in the biosynthesis of the deoxysugar D-desosamine. 



III. Polynucleotides 

The organization of the segment of the Saccharopolyspora erythraea (Sac. erythraea)
chromosome that determines the biosynthesis of erythromycin and the corresponding genes
that determine the biosynthesis of the sugars L-mycarose and D-desosamine, designated
eryB and eryC, respectively, are shown in FIG. 1A. It is seen that several genes are required
for the biosynthesis of each of the sugars and that these genes are interspersed among one
another. It is predicted that each gene encodes an enzyme that catalyzes one or a few steps in
the biosynthesis of L-mycarose or D-desosamine from thymidine
diphospho-4-keto-6-deoxyglucose (TDP-glucose); these steps are outlined in FIG. 2 and FIG. 3.
In the case of L-mycarose (shown in FIG. 2), these steps include: (1) C-2" deoxygenation,
(2) C-2"/C-3" enoyl reduction, (3) C-5" epimerization, (4) C-3" C-methylation, (5) C-4" keto
reduction, and (6) transfer to erythronolide B. For D-desosamine (shown in FIG. 3), these
steps comprise (1) CA'fS isomerization, (2, 3) C-3' deoxygenation and reduction, (4) C-3'
amination, (5, 6) N-3a' N-dimethylation, and transfer to mycarosyl erythronolide B.
WO 97/23630    PCT/US96/20238

This classification of genes (as belonging to either the eryB class or the eryC class) was
determined by first altering the wild type genes of interest in an erythromycin-producing
strain (i.e. in vivo) to inactivate their expression. The erythromycin products resulting from
such alterations were then analyzed. Genes whose alteration caused an accumulation of
erythronolide B (indicating a lack of L-mycarose, or failure to attach L-mycarose to the
erythronolide ring) were classified as eryB genes; genes whose alteration caused an
accumulation of 3-α-L-mycarosyl erythronolide B (indicating a lack of D-desosamine, or
failure to attach D-desosamine to the 3-α-L-mycarosyl erythronolide B ring) were classified
as eryC genes. Accordingly, it should be noted that all such genes identified herein as eryB
or eryC are involved in the synthesis of L-mycarose or D-desosamine. The predicted
functional activities of the polypeptides encoded by eryB and eryC will be discussed in
further detail below.
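
The classification rule described above can be restated as a small decision function. The following Python sketch is illustrative only; the function name and the string labels are hypothetical conveniences, and the sketch simply encodes the two accumulation phenotypes named in the text.

```python
# Intermediates named in the text; the string spellings are illustrative labels.
ERYTHRONOLIDE_B = "erythronolide B"
MYCAROSYL_ERYTHRONOLIDE_B = "3-alpha-L-mycarosyl erythronolide B"


def classify_gene(accumulated_intermediate: str) -> str:
    """Assign a gene class from the intermediate that accumulates when
    the gene is inactivated in an erythromycin-producing strain."""
    if accumulated_intermediate == ERYTHRONOLIDE_B:
        # L-mycarose is missing or is not attached.
        return "eryB"
    if accumulated_intermediate == MYCAROSYL_ERYTHRONOLIDE_B:
        # D-desosamine is missing or is not attached.
        return "eryC"
    # Any other phenotype falls outside the two classes described here.
    return "unclassified"
```

The third return value is an assumption added for completeness; the text itself describes only the two outcomes.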

In one aspect then, the present invention provides isolated and purified eryB and eryC
polynucleotides from Sac. erythraea that encode enzymes involved in the production of
glycosylated polyketides. A polynucleotide of the present invention that encodes a sugar
biosynthesis enzyme is an isolated single or double stranded polynucleotide having a
nucleotide sequence which comprises (a) a nucleotide sequence selected from the group
consisting of (i) the sense sequence of FIG. 4A (SEQ ID NO:1) from about nucleotide
position 54 to about nucleotide position 1136; (ii) the sense sequence of SEQ ID NO:1 from
about nucleotide position 1147 to about nucleotide position 2412; (iii) the sense sequence of
SEQ ID NO:1 from about nucleotide position 2409 to about nucleotide position 3410; (iv)
the sense sequence of FIG. 4B (SEQ ID NO:2) from about nucleotide position 80 to about
nucleotide position 1048; (v) the sense sequence of SEQ ID NO:2 from about nucleotide
position 1048 to about nucleotide position 2295; (vi) the sense sequence of SEQ ID NO:2
from about nucleotide position 2348 to about nucleotide position 3061; (vii) the sense
sequence of SEQ ID NO:2 from about nucleotide position 3214 to about nucleotide position
4677; (viii) the sense sequence of SEQ ID NO:2 from about nucleotide position 4674 to
about nucleotide position 5879; (ix) the sense sequence of SEQ ID NO:2 from about
nucleotide position 5917 to about nucleotide position 7386; and (x) the sense sequence of
SEQ ID NO:2 from about nucleotide position 7415 to about nucleotide position 7996;

(b) sequences complementary to the sequences of (a),

(c) sequences that, when expressed, encode polypeptides encoded by the sequences of
(a), and

(d) analogous sequences that hybridize under stringent conditions to the sequences of
(a).

A preferred polynucleotide is a DNA molecule. In another embodiment, the polynucleotide 
is an RNA molecule. 

The nucleotide sequences and deduced amino acid residue sequences of the sugar
biosynthesis genes are set forth in FIG. 4A(1-4) and FIG. 4B(1-9). The nucleotide sequences
of FIG. 4A(1-4) (SEQ ID NO:1) and FIG. 4B(1-9) (SEQ ID NO:2) represent full length DNA
clones of the sense strand of two distinct clusters of sugar biosynthesis genes and are
intended to represent both the sense strand (shown on top) and its complement. The amino
acid sequences depicted below the sense strand correspond to polypeptides encoded by a
nucleotide sequence selected from the group consisting of (i) the sense sequence of SEQ ID
NO:1 from about nucleotide position 54 to about nucleotide position 1136, (ii) the sense
sequence of SEQ ID NO:1 from about nucleotide position 1147 to about nucleotide position
2412, (iii) the sense sequence of SEQ ID NO:1 from about nucleotide position 2409 to about
nucleotide position 3410, (iv) the sense sequence of SEQ ID NO:2 from about nucleotide
position 80 to about nucleotide position 1048, (v) the sense sequence of SEQ ID NO:2 from
about nucleotide position 1048 to about nucleotide position 2295, (vi) the sense sequence of
SEQ ID NO:2 from about nucleotide position 2348 to about nucleotide position 3061, (vii)
the sense sequence of SEQ ID NO:2 from about nucleotide position 3214 to about nucleotide
position 4677, (viii) the sense sequence of SEQ ID NO:2 from about nucleotide position 4674
to about nucleotide position 5879, (ix) the sense sequence of SEQ ID NO:2 from about
nucleotide position 5917 to about nucleotide position 7386 and (x) the sense sequence of
SEQ ID NO:2 from about nucleotide position 7415 to about nucleotide position 7996. The
polypeptides encoded by the nucleotide sequences of (i)-(x) above are set forth as SEQ ID
NO:3-SEQ ID NO:12 respectively.
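
The coordinate ranges recited above can be tabulated and checked with a short Python sketch. The sketch assumes that the coordinates are 1-based and inclusive, and it pairs each range with the gene name given in the descriptions of FIG. 4A and FIG. 4B; the function and variable names are hypothetical and introduced only for illustration.

```python
# Coordinates as recited in the text, assumed to be 1-based and inclusive.
ORF_COORDINATES = {
    "SEQ ID NO:1": {
        "eryCII": (54, 1136),
        "eryCIII": (1147, 2412),
        "eryBII": (2409, 3410),
    },
    "SEQ ID NO:2": {
        "eryBIV": (80, 1048),
        "eryBV": (1048, 2295),
        "eryCVI": (2348, 3061),
        "eryBVI": (3214, 4677),
        "eryCIV": (4674, 5879),
        "eryCV": (5917, 7386),
        "eryBVII": (7415, 7996),
    },
}


def orf_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive coordinate range."""
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    return end - start + 1


def extract_orf(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive range out of a 0-based Python string."""
    return sequence[start - 1:end]


def overlap(range_a: tuple, range_b: tuple) -> int:
    """Number of positions shared by two inclusive coordinate ranges."""
    return max(0, min(range_a[1], range_b[1]) - max(range_a[0], range_b[0]) + 1)


# Length of every recited open reading frame, keyed by gene name.
ORF_LENGTHS = {
    gene: orf_length(*coords)
    for genes in ORF_COORDINATES.values()
    for gene, coords in genes.items()
}
```

Each recited range spans a whole number of codons, and some adjacent ranges share positions (for example, the eryCIII and eryBII ranges share four nucleotide positions), which a call such as `overlap((1147, 2412), (2409, 3410))` makes explicit.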

The present invention also contemplates analogous DNA sequences which hybridize
under stringent hybridization conditions to the DNA sequences set forth above. Stringent
hybridization conditions are well known in the art and define a degree of sequence identity
greater than about 80%-90%. The modifier "analogous" refers to those nucleotide sequences
that encode analogous polypeptides (i.e. in relation to a sugar biosynthesis enzyme),
analogous polypeptides being those which have only conservative differences and which
retain the conventional characteristics and activities of sugar biosynthesis enzymes. (A more
detailed description of analogous polypeptides is provided below.) The present invention
also contemplates naturally occurring allelic variations and mutations of the DNA sequences
set forth above so long as those variations and mutations code, on expression, for a sugar
biosynthesis enzyme of this invention as set forth hereinafter.
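
As a rough computational stand-in for the identity threshold mentioned above, percent identity between two sequences can be computed as follows. This Python sketch assumes the two sequences have already been aligned to equal length without gaps; it does not model hybridization itself, which is an experimental property, and the function name and example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Made-up sequences differing at two of ten positions.
reference = "ACGTACGTAC"
candidate = "ACGTACGTTT"
identity = percent_identity(reference, candidate)
meets_lower_bound = identity >= 80.0  # lower end of the range given in the text
```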

As is well known in the art, because of the degeneracy of the genetic code, there are
numerous other DNA and RNA molecules that can code for the same polypeptides as those
encoded by the aforementioned sugar biosynthesis genes and fragments thereof. The present
invention, therefore, contemplates those other DNA and RNA molecules which, on
expression, encode the polypeptides of SEQ ID NO:3-SEQ ID NO:11 or fragments thereof.
Having identified the amino acid residue sequence encoded by a sugar biosynthesis gene, and
with knowledge of all triplet codons for each particular amino acid residue, it is possible to
describe all such encoding RNA and DNA sequences. DNA and RNA molecules other than
those specifically disclosed herein, which molecules are characterized simply by a
change in a codon for a particular amino acid, are within the scope of this invention.

The 20 common amino acids and their representative abbreviations, symbols and
codons are well known in the art (see, for example, Molecular Biology of the Cell, Second
Edition, B. Alberts et al., Garland Publishing Inc., New York and London, 1989). As is also
well known in the art, codons constitute triplet sequences of nucleotides in mRNA molecules
and, as such, are characterized by the base uracil (U) in place of the base thymine (T), which
is present in DNA molecules. A simple change in a codon for the same amino acid residue
within a polynucleotide will not change the structure of the encoded polypeptide. By way of
example, it can be seen from SEQ ID NO:1 that an AGC codon for serine exists at nucleotide
positions 126-128 and again at positions 420-422 and 561-563. However, it can also be seen
from that same sequence that serine can be encoded by a TCG codon (see, e.g., nucleotide
positions 192-194) and a TCC codon (see, e.g., nucleotide positions 204-206). Substitution of
the latter codons for serine with the AGC codon for serine, or vice versa, does not
substantially alter the DNA sequence of SEQ ID NO:1 and results in production of the same
polypeptide. In a similar manner, substitutions of the recited codons with other equivalent
codons can be made without departing from the scope of the present
invention.
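
The effect of a synonymous codon substitution can be demonstrated with a minimal Python sketch built on the standard genetic code. The two short DNA sequences are made up for illustration and are not excerpts of SEQ ID NO:1; the function and variable names are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# All codons that encode serine (S) in the standard code.
SERINE_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa == "S"}

# Made-up sequences: the serine codons AGC, TCG and TCC are interchanged.
original = "ATGAGCTCGTCC"
substituted = "ATGTCGAGCAGC"
same_polypeptide = translate(original) == translate(substituted)
```

Both sequences translate to the same four-residue peptide, so the substitution changes the nucleotide sequence without changing the encoded polypeptide.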

A polynucleotide of the present invention can also be an RNA molecule. An RNA
molecule contemplated by the present invention is complementary to or hybridizes under
stringent conditions to any of the DNA sequences set forth above. Exemplary and preferred
RNA molecules are mRNA molecules that encode sugar biosynthesis enzymes of this
invention.

IV. Polypeptides 

In another aspect, the present invention provides polypeptides which are reasonably
believed to be sugar biosynthesis enzymes. A sugar biosynthesis enzyme of the present
invention is a polypeptide of about 21 kDa to about 47 kDa. As set forth in FIGS. 5A-5E,
analogs of the predicted polypeptides encoded by certain eryB and eryC genes have been
identified in various species and their sequences compared using the PRETTY routine
(Genetics Computer Group (GCG) Sequence Analysis Software Package, Madison, WI).
Due to the degree of amino acid sequence identity existing between the polypeptides of these
other sugar biosynthesis genes and the polypeptides encoded by the eryB and eryC genes,
certain enzymatic activities can reasonably be attributed to the eryB and eryC polypeptides.

By way of example, analogs of the polypeptide encoded by the eryBIV gene have
been identified in Yersinia pseudotuberculosis, Salmonella enterica, Streptomyces griseus and
Escherichia coli (see FIG. 5A). The various analogs have been identified with from 290-328
amino acid residues and are characterized by a low degree of amino acid sequence identity.
(For example, the identity between the sugar biosynthesis enzyme encoded by the eryBIV
gene of Sac. erythraea and the sugar biosynthesis enzyme encoded by the galE gene of E.
coli is 20% at the amino acid level.) However, a conserved amino acid sequence motif, G x x
G x x G (where G represents the amino acid glycine and x represents any other amino acid
residue), is found within the first 30 amino acid residues of all analogs shown. Since the
polypeptide encoded by the galE gene has been shown to be an epimerase (whose mechanism
includes a ketoreduction (Bauer et al., Proteins 12:372 (1992))), the eryBIV gene product is
reasonably predicted to be a ketoreductase.

As set forth in FIG. 5B, analogs of the sugar biosynthesis enzyme encoded by the
eryBVII gene have been identified in Streptomyces griseus, Salmonella enterica, Yersinia
enterocolitica and Yersinia pseudotuberculosis. The various analogs have been identified with
from 183-200 amino acid residues and are characterized by a moderate degree of amino acid
identity. By way of example, the identity at the amino acid level between the sugar
biosynthesis enzyme encoded by the eryBVII gene of Sac. erythraea and the sugar
biosynthesis enzyme encoded by the rfbC gene of Salmonella enterica or the strM gene of
Streptomyces griseus is 37% and 61%, respectively. Furthermore, a common characteristic
of these particular polypeptides (including that of eryBVII) is that they are only associated
with L-sugar biosynthesis and not with D-sugar biosynthesis. Thus the gene product of
eryBVII is reasonably predicted to function as a C-5 epimerase which converts the
stereochemistry of the sugar from the "D" configuration to the "L" configuration.
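
The identity percentages quoted in this section are computed from pairwise alignments. A minimal sketch of one such calculation follows. It assumes the two sequences are already aligned with "-" as the gap character, and it counts only columns in which both sequences have a residue; the text does not state which denominator the GCG software used, so that convention is an assumption, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 4 of the 6 ungapped columns are identical.
print(round(percent_identity("MKV-LTG", "MRVALSG"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give somewhat different values, so identities reported by different programs are not always directly comparable.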



As set forth in FIG. 5C, analogs of the sugar biosynthesis enzyme encoded by the
eryCIV gene have been identified in Sac. erythraea and Yersinia pseudotuberculosis. As
shown in FIG. 5C, the predicted amino acid sequences of the protein products of eryCI and
eryCIV share 34% sequence identity to each other, and 27% and 25%, respectively, to the
predicted amino acid sequence encoded by ascC from Yersinia pseudotuberculosis. The
enzyme encoded by ascC has been shown to remove a hydroxyl group located at the C-3
position of L-ascarylose (Liu and Thorson, Annu. Rev. Microbiol. 48:223 (1994)). Thus, at
least one of the polypeptides encoded by eryCI or eryCIV is predicted to be an enzyme which
functions in deoxygenation reactions.

Furthermore, the enzyme encoded by the ascC gene requires the biochemical cofactor
pyridoxamine, which is the same cofactor used in biochemical transamination reactions.
Consequently, it has been proposed that some protein analogs (such as dnrJ from
Streptomyces peucetius, prg1 from Streptomyces alboniger and strS from Streptomyces
griseus) having a moderate degree of sequence similarity to the polypeptide encoded by ascC
function as transaminases in amino sugar biosynthesis (Thorson et al., J. Am. Chem. Soc.
115:6993 (1993)). Since the biosynthesis of D-desosamine requires both deoxygenation and
transamination, it is reasonable to predict that at least one of the polypeptides encoded by the
eryCI or eryCIV genes functions in transamination reactions.

As set forth in FIG. 5D, the predicted polypeptides encoded by eryBV and eryCIII
share 43% identity at the amino acid level and, as such, may be assumed to have similar
activities with respect to their particular sugars. However, as shown in FIGS. 2 and 3, there
are no common steps in the proposed pathways of L-mycarose and D-desosamine
biosynthesis. Rather than having similar sugar biosynthesis functions, these polypeptides are
predicted to be nucleotidyl-sugar transferases which (in Sac. erythraea at least) function to
attach L-mycarose and D-desosamine to erythronolide B and 3-α-mycarosylerythronolide B,
respectively.

As set forth in FIG. 5E, analogs of the polypeptide encoded by the eryCVI gene have
been identified in Streptomyces ambofaciens, Streptomyces purpurascens, and Rattus
norvegicus. The various analogs have been identified with from 237-293 amino acid residues
and are characterized by a low to moderate degree of amino acid identity. By way of
example, the identity between the polypeptide encoded by the eryCVI gene of Sac. erythraea
and the glycine methyltransferase of Rattus norvegicus is 26% at the amino acid level.
Furthermore, these sugar biosynthesis enzymes share a common sequence motif,
LDVACGTG (SEQ ID NO:30 = amino acid positions 64-71 in the consensus sequence in
FIG. 5E), with rat glycine methyltransferase, whose biochemical function is known (Ogawa et
al., Eur. J. Biochem. 168:141 (1987)). Thus these polypeptides are predicted to be
N-methyltransferases.
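
Motif-based comparisons of this kind (the LDVACGTG motif here, and the G x x G x x G motif discussed for the eryBIV analogs) can be sketched as a simple pattern search. The protein sequence below is a hypothetical toy sequence, not one of the sequences of FIG. 5, and the function name find_motif is illustrative only.

```python
import re


def find_motif(protein, pattern, within=None):
    """Return the 0-based start of the first match of pattern, or None.

    If within is given, only the first `within` residues are searched.
    """
    region = protein.upper()
    if within is not None:
        region = region[:within]
    match = re.search(pattern, region)
    return None if match is None else match.start()


# Hypothetical toy sequence containing both motifs.
toy = "MKVLVTGGAGFIGSALVRRLLEAGHEVTLDVACGTGKLA"

# G x x G x x G, where x is any residue, restricted to the first 30 residues.
print(find_motif(toy, r"G..G..G", within=30))
# Exact LDVACGTG motif anywhere in the sequence.
print(find_motif(toy, "LDVACGTG"))
```

A motif hit of this sort only supports a functional prediction; as in the text, the inference rests on the matching protein having a biochemically characterized function.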

In another aspect, the present invention provides a recombinant C-4" ketoreductase
from Sac. erythraea. A recombinant Sac. erythraea C-4" ketoreductase of the present
invention is a polypeptide of about 322 or fewer amino acid residues. A preferred recombinant
Sac. erythraea C-4" ketoreductase is that encoded by the nucleotide sequence of SEQ ID
NO:2 from about nucleotide position 80 to about nucleotide position 1048.

The present invention also contemplates amino acid residue sequences that are 
substantially duplicative of the sequences set forth herein such that those sequences 
demonstrate like biological activity to disclosed sequences. Such contemplated sequences 
include those analogous sequences characterized by a minimal change in amino acid residue 
sequence or type (e.g., conservatively substituted sequences) which insubstantial change does 
not alter the fundamental nature and biological activity of the aforementioned sugar 
biosynthesis enzymes. 

It is well known in the art that modifications and changes can be made in the structure
of a polypeptide without substantially altering the biological function of that peptide. For 
example, certain amino acids can be substituted for other amino acids in a given polypeptide 
without any appreciable loss of function. In making such changes, substitutions of like amino 
acid residues can be made on the basis of relative similarity of side-chain substituents, for 
example, their size, charge, hydrophobicity, hydrophilicity, and the like. 

As detailed in United States Patent No. 4,554,101, incorporated herein by reference,
the following hydrophilicity values have been assigned to amino acid residues: Arg (+3.0);
Lys (+3.0); Asp (+3.0); Glu (+3.0); Ser (+0.3); Asn (+0.2); Gln (+0.2); Gly (0); Pro (-0.5);
Thr (-0.4); Ala (-0.5); His (-0.5); Cys (-1.0); Met (-1.3); Val (-1.5); Leu (-1.8); Ile (-1.8); Tyr
(-2.3); Phe (-2.5); and Trp (-3.4). It is understood that an amino acid residue can be
substituted for another having a similar hydrophilicity value (e.g., within a value of plus or
minus 2.0) and still obtain a biologically equivalent polypeptide.

In a similar manner, substitutions can be made on the basis of similarity in
hydropathic index. Each amino acid residue has been assigned a hydropathic index on the
basis of its hydrophobicity and charge characteristics. Those hydropathic index values are:
Ile (+4.5); Val (+4.2); Leu (+3.8); Phe (+2.8); Cys (+2.5); Met (+1.9); Ala (+1.8); Gly (-0.4);
Thr (-0.7); Ser (-0.8); Trp (-0.9); Tyr (-1.3); Pro (-1.6); His (-3.2); Glu (-3.5); Gln (-3.5); Asp
(-3.5); Asn (-3.5); Lys (-3.9); and Arg (-4.5). In making a substitution based on the
hydropathic index, a value of within plus or minus 2.0 is preferred.
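
The plus-or-minus 2.0 criterion stated for both scales can be expressed as a simple lookup. The sketch below uses the values exactly as listed in the two preceding paragraphs; the function names (within_window, candidate_substitutions) are hypothetical, and passing the criterion is only a screen, not a demonstration of biological equivalence.

```python
# Hydrophilicity values as listed in the text above.
HYDROPHILICITY = {
    "Arg": 3.0, "Lys": 3.0, "Asp": 3.0, "Glu": 3.0, "Ser": 0.3, "Asn": 0.2,
    "Gln": 0.2, "Gly": 0.0, "Pro": -0.5, "Thr": -0.4, "Ala": -0.5, "His": -0.5,
    "Cys": -1.0, "Met": -1.3, "Val": -1.5, "Leu": -1.8, "Ile": -1.8, "Tyr": -2.3,
    "Phe": -2.5, "Trp": -3.4,
}

# Hydropathic index values as listed in the text above.
HYDROPATHIC_INDEX = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5, "Met": 1.9,
    "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8, "Trp": -0.9, "Tyr": -1.3,
    "Pro": -1.6, "His": -3.2, "Glu": -3.5, "Gln": -3.5, "Asp": -3.5, "Asn": -3.5,
    "Lys": -3.9, "Arg": -4.5,
}


def within_window(a, b, scale, window=2.0):
    """True if residues a and b differ by no more than `window` on the given scale."""
    return abs(scale[a] - scale[b]) <= window


def candidate_substitutions(residue, scale, window=2.0):
    """Alphabetical list of residues within `window` of `residue` on the given scale."""
    return sorted(r for r in scale
                  if r != residue and within_window(residue, r, scale, window))


print(within_window("Leu", "Ile", HYDROPATHIC_INDEX))    # difference of 0.7
print(within_window("Lys", "Leu", HYDROPATHIC_INDEX))    # difference of 7.7
print(candidate_substitutions("Trp", HYDROPHILICITY))
```

The two scales do not always agree on which residue pairs fall inside the window, so a substitution may satisfy one criterion and not the other.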

V. Production of novel glycosylated polyketides

In another aspect, the present invention comprises a general procedure for producing
novel polyketide structures in vivo by selectively altering, inactivating, or augmenting the
genetic information of the organism that naturally produces a related polyketide. That is, in
the present invention, novel polyketides of desired structure are produced by manipulation of
the eryB and/or eryC genes followed by their introduction into various polyketide-producing
microorganisms. These manipulations result in the formation of "glycosylation-modified"
polyketides (i.e., polyketides having an altered glycosylation pattern or configuration relative
to their native state). For example, "glycosylation-modified" polyketides are those which
have additional sugar groups attached (where none previously existed), have different sugars
(such as sugar intermediates) attached in place of the natural sugars, or lack sugar groups (at
positions where sugar groups previously existed).

In the case of Type I and Type II alterations (further described below),
glycosylation-modified polyketides may arise through mechanisms which cause either (1) the
non-production of the sugar attachment enzyme (i.e., the enzyme involved in attachment of a sugar
to the polyketide structure) or (2) the non-production of a sugar biosynthesis enzyme. In
the first instance, the sugar will not be attached to the polyketide since the enzyme which
functions to attach the sugar will be lacking. In the second situation, a sugar intermediate
from the biosynthesis pathway will be produced (depending on which enzyme is lacking) and
attached to the polyketide, provided it is recognized as a suitable substrate by the sugar
attachment enzyme; alternatively, it will not be recognized and, therefore, not attached. In the
case of Type III alterations (also described in detail below), glycosylation-modified
polyketides arise via attachment of additional or different sugars (i.e., not normally found in a
particular polyketide-producing strain) to the polyketide. It should be noted that these
postulated mechanisms are simply provided to enhance understanding of the novel processes
described herein; the actual mechanisms by which the Type I, II and III alterations produce
glycosylation-modified polyketides are not presently known.

In the first type of alteration (referred to herein as Type I alterations), genetically
altered eryB and/or eryC genes are introduced into the chromosome of Sac. erythraea or
another glycosylated polyketide-producing organism that also produces L-mycarose,
D-desosamine, or their closely related derivatives such as mycaminose (4-hydroxy
D-desosamine). The genetic alteration of an eryB and/or eryC gene is such that it causes a
non-functional enzyme to be synthesized. Once introduced into an appropriate strain, the altered
gene replaces its corresponding wild type gene, causing the strain to lose the ability to
produce a particular enzymatic activity involved in sugar biosynthesis. As a result, a
glycosylation-modified polyketide is produced via either of the mechanisms previously
described for a Type I alteration.

In a Type I change described herein, a specific mutation in an eryB and/or eryC gene 
of the Sac. erythraea chromosome is accomplished by a three step process which involves: 
1) specifically altering the DNA sequence of a desired sugar biosynthesis gene, 2) subcloning 
the altered sequence into a suitable vector capable of recombining in the chromosome of an 
appropriate host and 3) introducing the vector containing the subcloned sequence into the 
appropriate host so that exchange of the wild type allele with the mutated one will occur. The 
first step is accomplished using standard recombinant DNA techniques to effect a deletion, 
base pair conversion or frame-shift in the DNA sequence. The second step, which also 
employs standard recombinant techniques, involves subcloning the altered sequence into a 
vector which does not replicate in Sac. erythraea or the desired host. In the final step, the 
vector is introduced into a suitable host, where by the process of gene replacement, the 
altered allele replaces the wild-type one. All techniques employed in a Type I change are 
well known to those of ordinary skill in the art. 

Example 1 illustrates the process of gene replacement of an eryB gene. As Example 1 
shows, the eryB gene of interest is mutated and along with adjacent upstream and 
downstream DNA sequences, cloned into a non-replicating Sac. erythraea plasmid vector. 
The vector carrying the mutated allele and adjoining DNA is then introduced into the host 
strain by the process of protoplast transformation. Transformants are regenerated under 
selective conditions (i.e. conditions that require expression of a particular plasmid marker) in 
order to induce recombination of the plasmid into the host cell chromosome. In other words, 
since the plasmid does not replicate autonomously, it must reside in the chromosome to be 
maintained in the cell and to express a particular marker under selective conditions. Insertion 
is achieved when the regenerated cells undergo a single homologous recombination between 
one of the two DNA segments that flank the mutation on the plasmid and its homologous 
counterpart in the chromosome. The cells are then grown without selection for the marker,
which induces plasmid loss from the chromosome. This loss arises after the cells have
undergone a second recombination between the second DNA segment that flanks the
mutation and its homologous chromosomal counterpart. This second recombinational event
results in the loss of the plasmid sequences and the wild type allele from the chromosome; the
mutant allele, however, is retained.

In a variation of a Type I change, the non-production of the sugar biosynthesis 
enzyme (or attachment enzyme) may be achieved by the alternative mechanisms of promoter 
inactivation and/or transcriptional terminator insertion. These variations do not affect the
gene sequence itself but rather regulatory mechanisms involved in gene transcription. 
"Promoter" as used herein refers to that region of a DNA molecule which controls the 
initiation of RNA transcription. Such regions are known to bind RNA polymerases (i.e. the 
enzymes involved in synthesizing RNA molecules). This form of Type I change (i.e. 
promoter inactivation) involves two steps of 1) identifying the promoter region of the desired 
gene and 2) rendering the promoter region inoperable by mutation. As in the replacement 
mechanism described above such mutations may be effected by creating deletions in the 
promoter sequence or by base pair conversion. In the case where the promoter controls 
transcription of a single gene, inactivation of the promoter will eliminate expression of that
particular gene; of course, where the promoter controls expression of an entire operon (i.e. a 
series of genes whose expression is controlled by a single promoter), promoter inactivation 
will effectively eliminate expression of all genes in that operon. 

In a similar manner, the non-production of a sugar biosynthesis enzyme (or
attachment enzyme) may arise from inserting a transcriptional terminator upstream from the
gene to be inactivated. A "transcriptional terminator" as used herein is a nucleotide sequence
which signals RNA polymerase to cease transcription. An example of a transcriptional
terminator is a palindromic sequence capable of forming a stem-loop structure that is
followed by a stretch of U residues (for example, the transcriptional terminator that follows
gene VIII of bacteriophage fd (Beck and Zink, Gene, 16:35 (1981))). Effecting a change in
production of a sugar biosynthesis gene by this process involves 1) identifying the gene or
genes of interest (in the case of an operon arrangement) to be inactivated and 2) cloning a
transcriptional terminator sequence in a region of the DNA upstream from such gene(s). A
transcriptional terminator will cause the polymerase involved in RNA transcription to stop (at
or near the signaling region), thereby preventing transcription of any downstream sequences.
Thus, changes such as promoter inactivation and transcriptional terminator insertion, which
directly affect expression of sugar biosynthesis genes, are also intended to be within the scope
of the invention.

In the second case (referred to herein as Type II alterations), eryB and/or eryC genes
are arranged on a vector in an antisense orientation relative to a promoter capable of allowing
expression of the gene in Sac. erythraea or Streptomyces. The vector is then introduced into
a polyketide-producing microorganism. As a result of this vector construction, antisense
messenger RNA (mRNA) is produced which interferes with the translation of the wild-type
mRNA. As with the Type I manipulation, novel glycosylation-modified polyketides will
be produced in which the normal mycarose, desosamine, and/or closely related sugar residue
is lacking or is substituted by a sugar intermediate.

In a Type II change, inactivation of the eryB and/or eryC genes by antisense
expression is accomplished by a two-step procedure in which (1) a specific sugar biosynthesis
gene is subcloned into an expression vector in an antisense (i.e., reverse) orientation; and (2)
the antisense expression vector is introduced into the desired strain. The first step is
accomplished using standard recombinant DNA techniques employing either E. coli or
Streptomyces as the host, and an expression vector (capable of replicating in either host) that
can be assembled to contain a Streptomyces promoter. Streptomyces promoters may be
obtained from any commercially available Streptomyces plasmids or Streptomyces-E. coli
shuttle plasmids. In step 2, the antisense expression vector is introduced into a suitable
Streptomyces strain and the transformed cells are grown under selective conditions in order to
maintain the expression plasmid in the cell.

As described in Example 2, the gene to be inactivated is subcloned in its reverse
orientation downstream of a Streptomyces promoter (which is contained within a replicating
Sac. erythraea plasmid). The plasmid carrying the antisense gene is then introduced into the
host strain by protoplast transformation. Transformants are regenerated under selective
conditions in order to maintain the autonomously replicating plasmid in the cells. Subsequent
expression of the antisense gene causes the production of an antisense messenger RNA
(mRNA) that is complementary to the mRNA of the native allele of the selected gene.
Through standard nucleotide base pair interactions, the antisense mRNA and the native
mRNA form an RNA duplex that occludes the ribosome binding site of the native mRNA.
This interaction prevents ribosomal translation of the native mRNA and the corresponding
synthesis of the enzyme encoded by that mRNA. In this way, specific enzymatic steps in
sugar biosynthesis corresponding to the identity of the gene expressed in the antisense
orientation are blocked, leading to the production of novel sugar intermediates which, when
attached to the polyketide ring of the host microorganism, give rise to novel
glycosylation-modified polyketides. Alternatively, the antisense expression vector can be constructed using
a non-replicating Sac. erythraea vector that includes flanking DNA from a nonessential
region of the Sac. erythraea chromosome, such as the region immediately upstream from the
eryK gene (FIG. 1). This vector can then be used to stably insert the antisense construction
into the chromosome by homologous recombination in a fashion similar to that described for
the construction of a Type I alteration.
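
The complementarity that underlies the antisense approach can be illustrated with a short sketch. It assumes that transcription of a gene cloned in reverse orientation yields the RNA equivalent of the reverse complement of the coding strand; the six-nucleotide sequence and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Reverse complement of a DNA strand written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """mRNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def antisense_rna(coding_strand):
    """Transcript expected when the gene is cloned in reverse orientation."""
    return transcribe(reverse_complement(coding_strand))


def forms_duplex(rna_a, rna_b):
    """True if the two RNAs can pair along their whole length in antiparallel fashion."""
    return len(rna_a) == len(rna_b) and all(
        RNA_PAIRS[x] == y for x, y in zip(rna_a, reversed(rna_b)))


sense_gene = "ATGGCC"  # hypothetical fragment of a coding strand
native = transcribe(sense_gene)
antisense = antisense_rna(sense_gene)
print(native, antisense, forms_duplex(native, antisense))
```

Only Watson-Crick pairs are considered here; the sketch says nothing about how efficiently a given antisense transcript blocks translation in vivo.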

In the third case (referred to herein as Type III alterations), novel
glycosylation-modified polyketides of desired structure are produced by arranging all or a subset of the
eryB and/or eryC genes on a replicating vector and introducing these genes en bloc into a
"distinct" polyketide-producing organism, i.e., one other than the microorganism from which
the eryB and/or eryC genes were taken. As an example, eryB and/or eryC genes may be
taken from Sac. erythraea and introduced into Streptomyces violaceoniger or Streptomyces
venezuelae. In this case, mycarose, desosamine, their biochemical intermediates and/or their
closely related derivatives will be synthesized and attached at specific positions to polyketide
compounds that do not necessarily carry these, or any, sugar residues. Some examples of
novel glycosylated polyketides that may be produced in hosts that carry such manipulations
are shown in FIG. 6.

In Type III changes, the genes for the biosynthesis of mycarose and/or desosamine are
introduced into a polyketide-producing organism other than Sac. erythraea by another simple
two-step procedure: 1) all or a subset of the eryB and/or eryC genes are assembled together on
a replicating plasmid downstream of a Streptomyces promoter; and 2) the plasmid is
introduced into the polyketide-producing organism. Step 1 requires standard recombinant
DNA manipulations employing E. coli and/or Streptomyces as the host. Step 2 requires one
or more plasmids out of the several Streptomyces vectors or E. coli-Streptomyces shuttle
vectors available, one or more promoters that function in Streptomyces, and a selection for
the presence of the strain carrying the plasmid. As described in Examples 3 and 4, sets of the
eryB and/or eryC genes are sequentially subcloned together on a replicating vector
downstream of a suitable promoter that functions in the desired host. The plasmid carrying
the grouped genes is then introduced into the host strain by electroporation or by
transformation of protoplasts employing selection for a plasmid marker.

GENERAL METHODS 

Materials, Plasmids, and Bacterial Strains

Restriction endonucleases, T4 DNA ligase, competent E. coli DH5α cells, X-gal,
IPTG and plasmids pUC18, pUC19, and pBR322 were purchased from Bethesda Research
Laboratories (BRL), Gaithersburg, MD. VentR® DNA polymerase was purchased from New
England Biolabs (Beverly, MA). Plasmids pGEM®5Zf, pGEM®7Zf, and pGEM®11Zf were
from Promega, Madison, WI; plasmids pIJ4070 and pIJ702 were obtained from the John
Innes Institute, Norwich, England; and plasmids pWHM3 and pWHM4 (J. Bacteriol.
171:5872 (1989)) were obtained from C. R. Hutchinson, University of Wisconsin, Madison, WI.
[α-32P]dCTP, Hybond™-N nylon membranes, and Megaprime nick translation kits were
from Amersham Corp., Chicago, IL. SeaKem® LE agarose and SeaPlaque® low gelling
temperature agarose were from FMC Bioproducts, Rockland, ME. E. coli K12 strains
carrying the E. coli-Sac. erythraea shuttle plasmids pWHM3 and pWHM4 (Vara et al., J.
Bacteriol., 171:5872 (1989)) and pAIX have been deposited at the Agricultural Research
Culture Collection (NRRL), 1815 N. University Street, Peoria, Illinois 61604, as of
December 5, 1995, under the terms of the Budapest Treaty and will be maintained for a
period of thirty (30) years from the date of deposit, or for five (5) years after the last request
for the deposit, or for the enforceable period of the U.S. patent, whichever is longer.
Plasmids pWHM3, pWHM4 and pAIX were accorded the accession numbers NRRL
B-21512, NRRL B-21513 and NRRL B-21514, respectively. Sac. erythraea strain NRRL2338
is also available from the Agricultural Research Service culture collection. Staphylococcus
aureus Th^R (thiostrepton resistant) was obtained by plating 10^8 cells of S. aureus on agar
medium containing 10 µg/ml thiostrepton and picking a survivor after 48 hr growth at 37°C.
Thiostrepton was obtained from Sigma Chemical, St. Louis, MO. All other chemicals and
reagents were from standard commercial sources unless otherwise specified.

DNA Manipulations 

Standard conditions were employed for restriction endonuclease digestion, agarose
gel electrophoresis, isolation of DNA fragments from low melting agarose gels, DNA
ligation, plasmid isolation from E. coli by alkaline lysis, and transformation of E. coli
employing selection for ampicillin resistance (150 µg/ml) on LB agar plates (Sambrook et al.,
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Plainview,
NY, 1989). Total DNA from Sac. erythraea and Streptomyces species (including S. fradiae,
S. caelestis, S. violaceoniger, S. hygroscopicus, S. venezuelae) was prepared according to
described procedures (Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory
Manual, John Innes Foundation, Norwich, UK (1985)). Transfer of DNA from agarose gels
to Hybond™-N membranes and Southern analysis using Megaprime™ nick translated probes
were performed according to the manufacturer's instructions.

Amplification of DNA Fragments

Synthetic deoxyoligonucleotides were synthesized on an ABI Model 380A
synthesizer (Applied Biosystems, Foster City, CA) following the manufacturer's
recommendations. Amplification of DNA fragments was performed by the polymerase chain
reaction (PCR) using a Perkin Elmer GeneAmp® PCR System 9600. Reactions contained
100 pmol of each primer, 1 µg of template DNA (chromosomal DNA from Sac. erythraea
NRRL2338), 2 units VentR® DNA polymerase in 100 µl volume of PCR buffer (10 mM KCl,
10 mM (NH4)2SO4, 20 mM Tris-HCl (pH 8.8 at 25°C), 2.5 mM MgSO4, 0.1% Triton®
X-100) containing dATP (200 µM), dTTP (200 µM), dCTP (250 µM), and dGTP (250 µM).
The reaction mixture was subjected to 30 cycles. Each cycle consisted of one period of 35
sec at 96°C and one period of $ min at 72°C. The reaction products were visualized and
purified from low melting agarose. The PCR primers described in the examples were derived
from the nucleotide sequence of the eryB and eryC genes of FIG. 4.

Transformation and Gene Replacement in Sac. erythraea

Protoplasts of Sac. erythraea strains were prepared and transformed with miniprep
DNA isolated from E. coli according to published procedures (Yamamoto et al., J.
Antibiotics, 39:1304 (1986)). Non-integrative transformants, in the case of pWHM4
derivatives, were selected by regenerating the protoplasts and overlaying with thiostrepton
(final concentration 20 µg/ml) as described (Weber et al., Gene, 68:173 (1988)). Integrative
transformants, in the case of pWHM3 derivatives, were selected on thiostrepton-containing
agar plates (15 µg/ml) as described by Weber et al., Gene, 68:173 (1988). Loss of the Th^R
phenotype was monitored after two rounds of non-selective growth in SGGP media
(Yamamoto et al., J. Antibiotics, 39:1304 (1986)) followed by protoplasting and serial
dilution on non-selective agar media. Regenerated protoplasts were replica plated on
thiostrepton-containing media. Th^S (thiostrepton-sensitive) colonies arose at a frequency of
10^-1. Retention of the mutant allele was established by Southern hybridization of several
Th^S colonies.

Fermentation 

Sac. erythraea or Streptomyces cells are inoculated into 100 ml SCM medium (1.5%
soluble starch, 2.0% Difco Soytone, 0.15% Yeast Extract, 0.01% CaCl2) and allowed to grow
for 3 to 6 days. The entire culture is then inoculated into 10 liters of fresh SCM medium.
The fermenter is operated for a period of 4 to 7 days at 32°C, maintaining constant aeration
and pH at 7.0. After the fermentation is complete, the cells are removed by centrifugation at
4°C and the fermentation beer is kept cold until further use. When antibiotic selection to
maintain a plasmid, such as pXC4 or pXB6, is required, thiostrepton (10 µg/ml) is added to
both the 100 ml starter culture and the 10-liter fermenter.

The invention will be better understood in connection with the following examples,
which are intended as an illustration of and not a limitation upon the scope of the invention.
Both below and throughout the specification, it is intended that citations to the literature be 
expressly incorporated by reference. 

Example 1: Construction and characterization of Sac. erythraea ERBIV that produces
4"-deoxy-4"-oxo-erythromycin A

A. Construction of Plasmid pRBIV: A 4.3 kb PstI-HindIII fragment, which included
the eryBIV gene, was isolated from the plasmid pAIXS and subcloned into PstI-HindIII
digested pUC19 to generate plasmid pUCBIV. After transformation and isolation of the
plasmid from E. coli, the identity of pUCBIV was confirmed by digestion with MunI, which
released a fragment of 370 bp. Plasmid pUCBIV was then cut with the restriction enzyme
NcoI, the restriction site filled in with Klenow enzyme, and the plasmid religated to generate
plasmid pNCOBIV (which now carried a frameshift mutation in the eryBIV gene). After
transformation and isolation of the plasmid from E. coli, the identity of pNCOBIV was
confirmed by digestion with NsiI and HindIII, which released a fragment of 1.59 kb. (The
NsiI site was formed by the fill-in and religation of the NcoI site.) Finally, plasmid
pNCOBIV was digested with HindIII and SstI and the 3.2 kb fragment carrying the altered
eryBIV gene was isolated and ligated into HindIII and SstI digested pWHM3 to generate
plasmid pRBIV. After transformation and isolation of the plasmid from E. coli, the identity
of pRBIV was confirmed by digestion with KpnI, which released fragments of 5.2 kb, 4.4 kb,
and 0.72 kb.
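
The effect of the NcoI fill-in step can be checked on paper with a short sketch. It relies on the known recognition sequences of NcoI (C^CATGG, leaving a 4-nucleotide 5' overhang) and NsiI (ATGCAT); the test sequence is hypothetical and is not taken from the eryBIV gene, and the function name fill_in_ncoi is illustrative only.

```python
NCOI = "CCATGG"  # NcoI cuts C^CATGG, leaving a 5' CATG overhang on each end
NSII = "ATGCAT"  # NsiI recognition sequence


def fill_in_ncoi(seq):
    """Model cutting at a single NcoI site, filling in both ends, and religating.

    Filling in turns the left end into ...CCATG and the right end into CATGG...,
    so blunt-end religation duplicates the 4-nt CATG overhang.
    """
    seq = seq.upper()
    if seq.count(NCOI) != 1:
        raise ValueError("expected exactly one NcoI site")
    cut = seq.index(NCOI) + 1  # cut position on the top strand (after the first C)
    return seq[:cut] + "CATG" + seq[cut:]


wild_type = "ATGGCTCCATGGAAACGTTAA"  # hypothetical sequence with one NcoI site
mutant = fill_in_ncoi(wild_type)

print(mutant)
print(len(mutant) - len(wild_type), NSII in mutant, NCOI in mutant)
```

The four added nucleotides are not a multiple of three, which is consistent with the frameshift described above, and the religated junction CCATGCATGG contains an NsiI site while no longer containing an NcoI site.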

B. Construction of Sac. erythraea ERBIV: Sac. erythraea protoplasts were
transformed with plasmid pRBIV and integrative transformants selected as described in
General Methods. Resolution of the integrants by nonselective growth as described in
General Methods yielded Sac. erythraea ERBIV in which the wild type copy of the eryBIV
gene was replaced with the inactive mutant copy. Gene replacement was confirmed by
Southern analysis of NcoI digested Sac. erythraea DNA and NcoI-NsiI digested Sac.
erythraea DNA using the 1.58 kb NcoI-HindIII fragment isolated from plasmid pUCBIV
(coordinates 681-2214, FIG. 4B) as a probe. Wild type Sac. erythraea and wild type
resolvants display a hybridizing DNA fragment of 2.75 kb when digested with either NcoI or
NcoI-NsiI, whereas Sac. erythraea strain ERBIV is characterized by hybridization to either a
16 kb DNA fragment or a 2.75 kb DNA fragment when digested with NcoI or NcoI-NsiI,
respectively.

C. Isolation, purification, and properties of 4"-deoxy-4"-oxo-erythromycin A from
Sac. erythraea ERBIV: Sac. erythraea strain ERBIV is fermented for 4 days in SCM media
as described in General Methods. The fermentation broth of Sac. erythraea ERBIV is then
cooled to 4°C, adjusted to pH 4.0, and extracted once with methylene chloride. The
aqueous layer is readjusted to pH 9.0 and extracted twice with methylene chloride, and the
combined basic methylene chloride extracts are concentrated to a solid residue. This is
digested in methanol and chromatographed over a column of Sephadex LH-20 in methanol.
Fractions are tested for bioactivity against a sensitive organism, such as Staphylococcus
aureus Th^R, and active fractions are combined. The combined fractions are concentrated and
the residue is digested in 10 ml of the upper phase of a solvent system consisting of
n-heptane, benzene, acetone, isopropanol, and 0.05 M, pH 7.0 aqueous phosphate buffer
(5:10:3:2:5, v/v/v/v/v), and chromatographed on an Ito Coil Planet Centrifuge in the same
system. Active fractions are combined, concentrated, and partitioned between methylene
chloride and dilute ammonium hydroxide (pH 9.0). The methylene chloride layer is
separated and concentrated to yield the desired product as a white foam.

Example 2: Construction and characterization of Sac. erythraea ER720(pASBVII) that
produces 3-α-D-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B

A. Construction of plasmid pASX2 (see FIG. 7): The 290 bp EcoRI-BamHI segment
carrying the ermE* promoter is isolated from plasmid pIJ4070 and ligated into EcoRI-BamHI
digested pWHM4 DNA to form pASX1. After transformation and isolation of the plasmid
from E. coli, the identity of pASX1 is confirmed by digestion with ApaLI, which releases
fragments of 3.9 kb, 2.5 kb, 1.2 kb, 0.5 kb, and 0.4 kb. Two oligonucleotides of the
sequences: SEQ ID NO:31 (5'-GATCCAGCGTCTGCAGGCATGCTCTAGATACAATTA
AAGGCTCCTTTTGGAGCC'1114'll H lUlGGAGATTTTCAACGT-3') and
SEQ ID NO:32 (5'-AGCTACGTTGAAAATCTCCAAAAAAAAAGGCTCCAAAA
GGAGCCTTTAATTGTATCTAGAGCATGCCTGCAGACGCTG-3'), corresponding to the
(+) and (-) strands of the bacteriophage fd gene VIII transcription terminator (t-fd) (Beck et
al. (1978) Nucl. Acids Res. 5:4495) and including restriction enzyme sites for the enzymes
PstI, SphI, and XbaI, and overhanging ends compatible with BamHI and HindIII, are
synthesized, and approximately 250 ng of each oligonucleotide are then mixed together in TE
buffer and heated to 99°C for 1 min. The solution is cooled slowly to room temperature,
allowing the oligonucleotides to anneal due to self complementarity, and the annealed
oligonucleotides are then ligated into BamHI/HindIII digested pASX1 to give pASX2. After
transformation and isolation of the plasmid from E. coli, the identity of pASX2 is confirmed
by DNA sequencing of the 1.2 kb EcoRI/SalI fragment that contains the ermE* promoter and
the bacteriophage fd terminator.

B. Construction of plasmid pASBVII (see FIG. 8): The 598 base pair DNA segment
that carries the eryBVII gene, comprising coordinates 7398-7996 (FIG. 4B), is amplified by
PCR employing two oligonucleotides, SEQ ID NO:33
(5'-GATCGCATGCTCTAGAGTACGTGAGCTGGCGGTGGCGGGC-3') and SEQ ID NO:34
(5'-GATCCGGATCCGCATGCTTCACCTGCCGGTGCTGGCGGG-3'). After digestion of
the purified PCR product with BamHI-XbaI, the PCR fragment is ligated to BamHI-XbaI
digested pASX2 to give pASBVII. After transformation and isolation of the plasmid from E.
coli, the identity of pASBVII is verified by DNA sequencing of the 880 bp EcoRI-XbaI
insert.

C. Construction of Sac. erythraea ER720(pASBVII): Sac. erythraea strain ER720
protoplasts are transformed with plasmid pASBVII and transformants are selected with
thiostrepton (15 µg/ml). To confirm transformation, total DNA is isolated from Th^R colonies
and used to transform E. coli. After transformation and isolation of the plasmid from E. coli,
the identity of pASBVII is verified by restriction analysis with the enzymes PvuII and BamHI,
which releases a 1.48 kb fragment. Those Sac. erythraea colonies that are found to contain
pASBVII are designated Sac. erythraea ER720(pASBVII).

D. Isolation, purification, and properties of
3-α-D-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B from Sac. erythraea
ER720(pASBVII): Sac. erythraea
ER720(pASBVII) is fermented for 3 days in SCM media with thiostrepton selection as
described in General Methods. The fermentation broth is then cooled to 4°C, adjusted to
pH 4.0, and extracted once with methylene chloride. The aqueous layer is readjusted to pH
9.0 and extracted twice with methylene chloride, and the combined extracts are concentrated
to a solid residue. This is digested in methanol and chromatographed over a column of
Sephadex LH-20 in methanol. Fractions are tested for bioactivity against a sensitive
organism, such as Staphylococcus aureus Th^R, and active fractions are combined. The
combined fractions are concentrated and the residue is digested in 10 ml of the upper phase of
a solvent system consisting of n-heptane, benzene, acetone, isopropanol, and 0.05 M, pH 7.0
aqueous phosphate buffer (5:10:3:2:5, v/v/v/v/v), and chromatographed on an Ito Coil Planet
Centrifuge in the same system. Active fractions are combined, concentrated, and partitioned
between methylene chloride and dilute ammonium hydroxide (pH 9.0). The methylene
chloride layer is separated and concentrated to yield the desired product as a white foam.

Example 3: Construction and characterization of Streptomyces antibioticus ATCC
11891(pXB6) that produces 3-des-oleandrosyl-3-mycarosyl oleandomycin

A. Construction of plasmid pKB6 and intermediates (see FIG. 9)

i) Construction of plasmid pK1: The DNA sequences of pBR322 (GenBank
Accession #: J01749) and pUC19 (GenBank Accession #: X02514) are known. The 805 nt
DNA segment comprising coordinates 1673 through 2478 of pBR322 is amplified by PCR
employing two oligodeoxynucleotides, SEQ ID NO:35
(5'-GATCACATGTTCTTTCCTGCGTTATCCCCTG-3') and SEQ ID NO:36
(5'-GATCGGATCCATGCATGTCTAGAGCATCGCAGGATGCTGCTGGC-3'). After digestion of
the purified PCR product with AflIII and BamHI, the fragment is ligated into AflIII and
BamHI digested pUC19 to give plasmid pK1. The identity of plasmid pK1, after
transformation and isolation from E. coli, is verified by PvuII digestion, which releases
fragments of 0.55 kb and 2.55 kb. Plasmid pK1 contains the ROP region of pBR322 that
controls plasmid copy number.

ii) Construction of plasmid pKB1: The 2.24 kb DNA segment that carries the
eryBIV and eryBV genes, comprised between coordinates 56 and 2296 of the sequence
presented in SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:37 (5'-GAATGCATCCTGGAAAGCGAGCAAATGCTCCGGTG-3') and SEQ
ID NO:38 (5'-GATCTAGAGCTAGCCGGCGTGGCGGCGCGTG-3'). After digestion with
NsiI and XbaI, the fragment is ligated into NsiI and XbaI digested pK1 to yield plasmid pKB1,
5.3 kb in size. The identity of plasmid pKB1, after transformation and isolation from E. coli,
is verified by KpnI digestion, which releases fragments of 0.72 kb, 1.14 kb, and 3.42 kb.
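
The primers in this step carry the cloning sites in their 5' tails, and the reverse primer also introduces the site used in the next assembly step. A minimal sketch of how this can be checked is given below. The recognition sequences are the standard ones for NsiI, XbaI, and NheI; the helper name `find_sites` is hypothetical and used only for illustration.

```python
# Standard recognition sequences of the enzymes named in the cloning steps.
SITES = {"NsiI": "ATGCAT", "XbaI": "TCTAGA", "NheI": "GCTAGC"}

# Primers as given in the text for amplification of the eryBIV-eryBV segment.
PRIMERS = {
    "SEQ ID NO:37": "GAATGCATCCTGGAAAGCGAGCAAATGCTCCGGTG",
    "SEQ ID NO:38": "GATCTAGAGCTAGCCGGCGTGGCGGCGCGTG",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: 0-based position} for every recognition site present in seq."""
    return {name: seq.find(site) for name, site in sites.items() if site in seq}


hits = {name: find_sites(seq) for name, seq in PRIMERS.items()}
for name, found in hits.items():
    print(name, found)
```

The forward primer carries only the NsiI site, while the reverse primer carries XbaI followed by NheI, which is consistent with the NheI and XbaI digestion used to add the next gene.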

iii) Construction of plasmid pKB2: The 1.56 kb DNA segment that carries
the eryBVI gene, comprised between coordinates 3121 and 4677 of the sequence presented in
SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:39
(5'-GATCGCTAGCCGTGACCGGACCCTTACAGTGAGTG-3') and SEQ ID NO:40
(5'-GATCTAGACTTAAGTCATCCGGCGGTCCTGGTGTAGACGGC-3'). After digestion
with NheI and XbaI, the fragment is ligated into NheI and XbaI digested pKB1 to give plasmid
pKB2, 6.9 kb in size. The identity of plasmid pKB2, after transformation and isolation from
E. coli, is confirmed by BamHI digestion, which releases fragments of 0.22 kb, 0.40 kb, 2.6
kb, and 3.7 kb.

iv) Construction of plasmid pKB3: The 0.6 kb DNA segment that carries the
eryBVII gene, comprised between coordinates 7385 and 7987 of the sequence presented in
SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:41
(5'-GATCTTAAGAACCGGAGTTGCGAGTACGTGAGCTGGCG-3') and SEQ ID NO:42
(5'-GATCTAGACCTAGGTCACCTGCCGGTGCTGGCGGGCTC-3'). After digestion with
AflII and XbaI, the fragment is ligated into AflII and XbaI digested pKB2, giving plasmid
pKB3, 7.5 kb in size. The identity of plasmid pKB3, after transformation and isolation from
E. coli, is verified by PstI digestion, which releases fragments of 1.1 kb and 6.4 kb.

v) Construction of plasmid pKB4: The 1.0 kb DNA segment that carries the
eryBII gene, comprised between coordinates 2385 and 3410 of the sequence presented in
SEQ ID NO:1, is amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:43
(5'-GATCCTAGGCCGCAGGAAGGAGAGAACCACG-3') and SEQ ID NO:44
(5'-GATCTAGATTAATCACTGCAACCAGGCTTCCGGC-3'). Following digestion with
AvrII and XbaI, the fragment is ligated into AvrII and XbaI digested pKB3, yielding the desired
plasmid pKB4. After transformation and isolation of the plasmid from E. coli, the identity of
pKB4, 8.5 kb in size, is verified by BglII and EcoRI digestion, which releases fragments of
0.41 kb, 1.6 kb, 3.1 kb, and 3.4 kb.

vi) Construction of plasmid pKB5: The DNA sequence of eryBIII has been
reported (Haydock et al. (1991) Mol Gen Genet 230:120). The 1.3 kb DNA segment that
carries the eryBIII gene, comprised between coordinates 3965 and 5232 of the sequence
depicted in Haydock et al., is amplified by PCR employing two deoxyoligonucleotides, SEQ
ID NO:45 (5'-GATTAATTGGCCGCGGCGCCGCGCTCGTTATG-3') and SEQ ID NO:46
(5'-GATCTAGATAATTAATCATACGACTTCCAGTCGGGGTAG-3'). After digestion
with MseI and XbaI, the fragment is ligated into MseI and XbaI digested pKB4 to give the
desired plasmid pKB5, 9.8 kb in size. The identity of pKB5, after transformation and
isolation from E. coli, is verified by PstI digestion, which releases fragments of 1.1 kb, 2.5 kb,
and 6.1 kb, visualized by gel electrophoresis.

vii) Construction of plasmid pKB6: The eryBI gene has been mapped
(Haydock et al. (1991) Mol Gen Genet 230:120) and the DNA sequence on both flanks of
eryBI is known (Haydock et al. (1991) Mol Gen Genet 230:120 and GenBank Accession #
M11200). The 2.5 kb DNA segment that carries the eryBI gene, comprised between
coordinates 1.1 and 3.6 of the map presented in Haydock et al., is amplified by PCR
employing two deoxyoligonucleotides: SEQ ID NO:47
(5'-GATTAATTAATGATCAAGCTGAAAATTGTTTGCATG-3') and SEQ ID NO:48
(5'-GATCTAGACTGCCGGCTCAGCCTTCCCAGGTTCG-3'). After digestion with PacI and
XbaI, the fragment is ligated into PacI and XbaI digested pKB5 to give plasmid pKB6,
12.3 kb in size. The identity of pKB6, after transformation and isolation from E. coli, is
verified by BamHI digestion, which releases fragments of 0.22 kb, 0.40 kb, 1.4 kb, 2.6 kb,
3.3 kb, and 4.4 kb. Plasmid pKB6 carries all of the eryB genes, eryBI-eryBVII, that are
involved in the biosynthesis of mycarose and its attachment to the polyketide.
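
Each diagnostic digest in the pKB series should account for the whole plasmid, so the reported fragment sizes can be summed and compared with the stated plasmid size. The sketch below does this for pKB1 through pKB6 using only the numbers given in the text. The 0.15 kb tolerance and the helper name `check_digests` are assumptions introduced here to allow for gel sizing and rounding.

```python
# name: (stated plasmid size in kb, diagnostic fragment sizes in kb), as reported in the text
PLASMIDS = {
    "pKB1": (5.3, [0.72, 1.14, 3.42]),
    "pKB2": (6.9, [0.22, 0.40, 2.6, 3.7]),
    "pKB3": (7.5, [1.1, 6.4]),
    "pKB4": (8.5, [0.41, 1.6, 3.1, 3.4]),
    "pKB5": (9.8, [1.1, 2.5, 6.1]),
    "pKB6": (12.3, [0.22, 0.40, 1.4, 2.6, 3.3, 4.4]),
}

TOLERANCE_KB = 0.15  # hypothetical allowance for gel sizing and rounding


def check_digests(plasmids, tol=TOLERANCE_KB):
    """Return {name: (fragment total in kb, True if within tol of the stated size)}."""
    report = {}
    for name, (stated, fragments) in plasmids.items():
        total = round(sum(fragments), 2)
        report[name] = (total, abs(total - stated) <= tol)
    return report


report = check_digests(PLASMIDS)
for name, (total, ok) in report.items():
    print(f"{name}: fragments sum to {total} kb, consistent={ok}")
```

A mismatch larger than the tolerance would suggest either a missed small fragment or a transcription error in the reported sizes.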

B. Construction of plasmid pXSB6 (see FIG. 11): The 9.2 kb NsiI-XbaI segment of
pKB6, prepared as described in Example 3(A)(vii) above, that carries all of the eryB genes is
isolated and ligated into PstI/XbaI digested pASX2, prepared as described in Example 2(A)
above, to give plasmid pXSB6. After transformation and isolation of the plasmid from E.
coli, the identity of pXSB6, 17.2 kb in size, is verified by the observation of fragments of
0.41 kb, 1.9 kb, and 14.9 kb after EcoRI digestion. Plasmid pXSB6 carries all of the eryB
genes in a transcriptional fusion downstream of the ermE* promoter on an E. coli-
Streptomyces shuttle plasmid.

C. Construction of Plasmid pXB6

i) Construction of plasmid pN702 (see FIG. 10): Two oligonucleotides of the
sequences: SEQ ID NO:49 (5'-GGAATTCAGATCTATGCATTCTAGAA-3') and
SEQ ID NO:50 (5'-CGCGTTCTAGAATGCATAGATCTGAATTCCTGCA-3') that include
restriction enzyme sites for the enzymes EcoRI, BglII, NsiI, and XbaI and overhanging ends
compatible with PstI and MluI are synthesized. Approximately 250 ng of each
oligonucleotide are then mixed together in TE buffer and heated to 99°C for 1 min. After the
solution is cooled slowly to room temperature, allowing the oligonucleotides to anneal due to
self complementarity, the annealed oligonucleotides are ligated into PstI-MluI digested
pIJ702 to yield the desired plasmid pN702. After transformation and isolation of the plasmid
from Streptomyces lividans 1326, the identity of plasmid pN702, 4.3 kb in size, is verified by
the observation of fragments of 0.75 kb and 3.6 kb after EcoRI-BamHI or XbaI-BamHI
digestion.
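
The annealing step relies on the two linker oligonucleotides being complementary over a central region while leaving single-stranded ends that match the PstI and MluI digested vector. The following sketch checks this for the pN702 linker. The sequences are the two oligonucleotides as read here and should be treated as an assumption to be verified against the sequence listing; the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Linker strands, written 5' to 3' (assumed readings of SEQ ID NO:49 and NO:50).
TOP = "GGAATTCAGATCTATGCATTCTAGAA"
BOTTOM = "CGCGTTCTAGAATGCATAGATCTGAATTCCTGCA"


def duplex_overhangs(top, bottom):
    """Find where bottom pairs with all of top; return its unpaired 5' and 3' ends."""
    core = reverse_complement(top)
    start = bottom.find(core)
    if start < 0:
        raise ValueError("strands are not complementary")
    return bottom[:start], bottom[start + len(core):]


five_prime_end, three_prime_end = duplex_overhangs(TOP, BOTTOM)
print("5' overhang of bottom strand:", five_prime_end)   # pairs with an MluI end
print("3' overhang of bottom strand:", three_prime_end)  # pairs with a PstI end

# Internal cloning sites carried by the linker, in 5' to 3' order on the top strand.
for enzyme, site in [("EcoRI", "GAATTC"), ("BglII", "AGATCT"),
                     ("NsiI", "ATGCAT"), ("XbaI", "TCTAGA")]:
    print(enzyme, TOP.find(site))
```

MluI leaves a 5' CGCG extension and PstI leaves a 3' TGCA extension, so the two unpaired ends reported by the sketch are the ones needed for directional ligation into the digested vector.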

ii) Construction of plasmid pX1 (see FIG. 10): The 290 bp EcoRI-BamHI
segment that carries the ermE* promoter is isolated from plasmid pIJ4070 and ligated into
EcoRI-BglII digested pN702; the resulting mixture contains the desired plasmid pX1.
After transformation and isolation of the plasmid from Streptomyces lividans
1326, the identity of plasmid pX1, 4.6 kb in size, is verified by the observation of fragments
of 1.0 kb and 3.6 kb after NsiI-BamHI digestion.

iii) Construction of plasmid pXB6 (see FIG. 11): The 9.2 kb NsiI-XbaI
segment of pKB6, prepared as described in Example 3(A)(vii) above, that carries all of the
eryB genes is isolated and ligated into NsiI-XbaI digested pX1 to give the desired plasmid
pXB6. After transformation and isolation of the plasmid from Streptomyces lividans 1326,
the identity of plasmid pXB6, 13.8 kb in size, is verified by the observation of fragments of
0.41 kb, 1.9 kb, and 11.5 kb after EcoRI digestion. Plasmid pXB6 carries all of the eryB
genes in a transcriptional fusion to the ermE* promoter on a Streptomyces plasmid.

D. Construction of Streptomyces antibioticus ATCC 11891(pXB6): Approximately
500 µg of plasmid pXB6, isolated from Streptomyces lividans 1326(pXB6), are
electroporated into the oleandomycin producer Streptomyces antibioticus ATCC 11891, and
several of the resulting Thio^R colonies that appear on the R3M-agar plates containing
thiostrepton are analyzed for their plasmid content. The presence of plasmid pXB6, 13.8 kb
in size, is verified by the observation of fragments of 0.41 kb, 1.9 kb, and 11.5 kb after EcoRI
digestion.

E. Isolation, purification, and properties of 3-des-oleandrosyl-3-mycarosyl
oleandomycin from Streptomyces antibioticus ATCC 11891(pXB6): Streptomyces
antibioticus ATCC 11891(pXB6) is fermented for 5 days in SCM media with thiostrepton
selection as described in General Methods. The fermentation broth is then cooled to 4°C,
adjusted to pH 4.0, and extracted once with methylene chloride. The aqueous layer is
readjusted to pH 9.0 and extracted twice with methylene chloride, and the combined extracts
are concentrated to a solid residue. This is digested in methanol and chromatographed over a
column of Sephadex LH-20 in methanol. Fractions are tested for bioactivity against a
sensitive organism, such as Staphylococcus aureus Th^R, and active fractions are combined.
The combined fractions are concentrated and the residue is digested in 10 ml of the upper
phase of a solvent system consisting of n-heptane, benzene, acetone, isopropanol, and 0.05 M,
pH 7.0 aqueous phosphate buffer (5:10:3:2:5, v/v/v/v/v), and chromatographed on an Ito Coil
Planet Centrifuge in the same system. Closely eluting active fractions are combined,
concentrated, and partitioned between methylene chloride and dilute ammonium hydroxide
(pH 9.0). The methylene chloride layer is separated and concentrated to yield the desired
product as a white foam.

Example 4: Construction and characterization of Streptomyces violaceoniger NRRL
2834(pXC4) that produces 5-des-chalcosyl-5-desosaminoyl lankamycin

A. Construction of plasmid pKC4 and intermediates (see FIG. 12)

i) Construction of plasmid pKC1: The 2.4 kb DNA segment that carries the
eryCII and eryCIII genes, comprised between coordinates 33 and 2413 of the sequence
presented in SEQ ID NO:1, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:51 (5'-GAATGCATCTGGCTGGGCGGAGGGAATTCATG-3') and
SEQ ID NO:52 (5'-GATCTAGACTTAAGTCATCGTGGTTCTCTCCTTCCTGCGGC-3').
After digestion with NsiI and XbaI, the purified PCR fragment is ligated into NsiI
and XbaI digested pK1 to give plasmid pKC1, 5.5 kb in size. The identity of plasmid pKC1,
after transformation and isolation from E. coli, is verified by EcoRI digestion, which releases
fragments of 2.2 kb and 3.3 kb.

ii) Construction of plasmid pKC2: The 732 bp DNA segment that carries the
eryCVI gene, comprised between coordinates 2331 and 3063 of the sequence presented in
SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:53 (5'-GATCCTTAAGCTCCGGAGGGAGCAGGGATG-3') and
SEQ ID NO:54 (5'-GATCTAGACCTAGGTCATCCGCGCACACCGACGAAC-3'). After
digestion with AflII and XbaI, the purified PCR fragment is ligated into AflII and XbaI
digested pKC1 to give plasmid pKC2, 6.2 kb in size. The identity of plasmid pKC2, after
transformation and isolation from E. coli, is verified by XbaI-EcoRI digestion, which releases
fragments of 0.95 kb, 2.2 kb, and 3.1 kb.

iii) Construction of plasmid pKC3: The 2.7 kb DNA segment that carries the
eryCIV and eryCV genes, comprised between coordinates 4650 and 7386 of the sequence
presented in SEQ ID NO:2, is amplified by PCR employing two deoxyoligonucleotides,
SEQ ID NO:55 (5'-GATCCTAGGCCGTCTACACCAGGACCGCCGG-3') and
SEQ ID NO:56 (5'-GATCTAGATTAATCACCTTCCGCGCAGGAAGCCGC-3'). After
digestion with AvrII and XbaI, the purified PCR fragment is ligated into AvrII and XbaI
digested pKC2 to yield plasmid pKC3, 9.0 kb in size. The identity of plasmid pKC3, after
transformation and isolation from E. coli, is verified by SphI digestion, which releases
fragments of 4.0 kb and 5.0 kb.

iv) Construction of plasmid pKC4: The DNA sequence of the eryCI gene has
been determined (GenBank Accession #X15541). The 1.1 kb DNA segment that carries the
eryCI gene, comprised between coordinates 38 and 1161 of the sequence indicated above, is
amplified by PCR employing two deoxyoligonucleotides, SEQ ID NO:57
(5'-GATCTTAAGCCGCCACTCGAACGGACACTCG-3') and SEQ ID NO:58
(5'-GATCTAGATCAAGCCCCAGCCTTGAGGG-3'). After digestion with MseI and XbaI,
the fragment is ligated into
MseI and XbaI digested pKC3 to give plasmid pKC4, 10.1 kb in size. The identity of plasmid
pKC4, after transformation and isolation from E. coli, is verified by KpnI digestion, which
releases fragments of 0.15 kb, 0.31 kb, 4.1 kb, and 5.5 kb. Plasmid pKC4 carries all of the
eryC genes, eryCI-eryCVI, that are involved in the biosynthesis of desosamine and its
attachment to the polyketide.
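
The segment sizes quoted for each PCR step follow directly from the cited coordinates, and the plasmid sizes follow from adding each insert to the previous construct. The sketch below works through this arithmetic for the pKC series. It ignores the few bases added by primer tails or removed by digestion, and it takes the size of pK1 as 3.1 kb, which is an assumption derived here from the two reported PvuII fragments (0.55 kb and 2.55 kb) rather than a size stated in the text.

```python
# (insert, start coordinate, end coordinate) as cited for each PCR step
INSERTS = [
    ("eryCII-eryCIII", 33, 2413),
    ("eryCVI", 2331, 3063),
    ("eryCIV-eryCV", 4650, 7386),
    ("eryCI", 38, 1161),
]

STATED_KB = [5.5, 6.2, 9.0, 10.1]  # sizes given in the text for pKC1 through pKC4
PK1_KB = 3.1  # assumption: sum of the two reported PvuII fragments of pK1


def segment_lengths(inserts):
    """Length in bp of each amplified segment, computed as end minus start."""
    return [end - start for _name, start, end in inserts]


def cumulative_sizes(backbone_kb, inserts):
    """Running plasmid size in kb after each insert is added to the backbone."""
    sizes, total = [], backbone_kb
    for length in segment_lengths(inserts):
        total += length / 1000.0
        sizes.append(total)
    return sizes


lengths = segment_lengths(INSERTS)
predicted = cumulative_sizes(PK1_KB, INSERTS)
for (name, _s, _e), bp, kb, stated in zip(INSERTS, lengths, predicted, STATED_KB):
    print(f"{name}: {bp} bp, predicted {kb:.2f} kb, stated {stated} kb")
```

Under these assumptions each predicted size falls within 0.1 kb of the size stated for the corresponding plasmid.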

B. Construction of Plasmid pXSC4 (see FIG. 13): The 6.9 kb NsiI-XbaI segment of
pKC4 that carries all of the eryC genes is isolated and ligated into PstI-XbaI digested pASX2,
prepared as described in Example 2(A), to give the desired plasmid pXSC4, 14.9 kb in size,
wherein all of the eryC genes are transcriptionally linked downstream of the ermE* promoter
on an E. coli-Streptomyces shuttle plasmid. The identity of plasmid pXSC4, after
transformation and isolation from E. coli, is verified by the observation of fragments of 0.29
kb, 2.2 kb, and 12.4 kb after EcoRI digestion.

C. Construction of Plasmid pXC4 (see FIG. 13): The 6.9 kb NsiI-XbaI segment of
pKC4 that carries all of the eryC genes is isolated and ligated into NsiI-XbaI digested pX1,
prepared as described in Example 3(C)(ii), to give the desired plasmid pXC4, 11.5 kb in size,
wherein all of the eryC genes are transcriptionally linked downstream of the ermE* promoter
on a Streptomyces plasmid. After transformation and isolation of the plasmid from
Streptomyces lividans 1326, the identity of plasmid pXC4 is verified by the observation of
fragments of 0.29 kb, 2.2 kb, and 9.0 kb after EcoRI digestion.
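
Because the same two gene cassettes are placed in the same two vectors across Examples 3 and 4, the stated construct sizes can be cross-checked by subtracting the cassette size and comparing the implied vector sizes. This is a sketch of that check using only sizes given in the text; the dictionary layout and the helper name `implied_backbones` are illustrative assumptions.

```python
# NsiI-XbaI cassette sizes in kb, as stated in the text
INSERT_KB = {"eryB cassette": 9.2, "eryC cassette": 6.9}

# (construct, vector, cassette, stated construct size in kb)
CONSTRUCTS = [
    ("pXSB6", "pASX2", "eryB cassette", 17.2),
    ("pXSC4", "pASX2", "eryC cassette", 14.9),
    ("pXB6", "pX1", "eryB cassette", 13.8),
    ("pXC4", "pX1", "eryC cassette", 11.5),
]


def implied_backbones(constructs, inserts):
    """Group by vector the backbone size implied by each construct (kb, one decimal)."""
    implied = {}
    for _name, vector, cassette, size in constructs:
        implied.setdefault(vector, []).append(round(size - inserts[cassette], 1))
    return implied


backbones = implied_backbones(CONSTRUCTS, INSERT_KB)
for vector, sizes in backbones.items():
    print(vector, sizes, "consistent" if len(set(sizes)) == 1 else "inconsistent")
```

Both constructs built on the Streptomyces vector imply a 4.6 kb backbone, matching the size stated for that vector, and both shuttle constructs imply the same backbone size for pASX2.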

D. Construction of Streptomyces violaceoniger NRRL 2834(pXC4): Approximately
500 µg of the plasmid pXC4, isolated from Streptomyces lividans 1326(pXC4), are
electroporated into the lankamycin producer Streptomyces violaceoniger NRRL 2834, and
several of the resulting Thio^R colonies that appear on the R3M-agar plates containing
thiostrepton are analyzed for their plasmid content. The presence of plasmid pXC4 is verified
by the observation of fragments of 0.29 kb, 2.2 kb, and 9.1 kb in size after EcoRI digestion
of the plasmid.

E. Isolation, purification, and properties of 5-des-chalcosyl-5-desosaminoyl
lankamycin: S. violaceoniger NRRL 2834(pXC4) is fermented for 5 days in SCM media
with thiostrepton selection as described in General Methods. The fermentation broth is then
cooled to 4°C, adjusted to pH 4.0, and extracted once with methylene chloride. The
aqueous layer is readjusted to pH 9.0 and extracted twice with methylene chloride, and the
combined extracts are concentrated to a solid residue. This is digested in methanol and
chromatographed over a column of Sephadex LH-20 in methanol. Fractions are tested for
bioactivity against a sensitive organism, such as Staphylococcus aureus Th^R, and active
fractions are combined. The combined fractions are concentrated and the residue is digested
in 10 ml of the upper phase of a solvent system consisting of n-heptane, benzene, acetone,
isopropanol, and 0.05 M, pH 7.0 aqueous phosphate buffer (5:10:3:2:5, v/v/v/v/v), and
chromatographed on an Ito Coil Planet Centrifuge in the same system. Active fractions are
combined, concentrated, and partitioned between methylene chloride and dilute ammonium
hydroxide (pH 9.0). The methylene chloride layer is separated and concentrated to yield the
desired product as a white foam.

Although the present invention is illustrated in the examples listed above in terms of
preferred embodiments, these examples are not to be regarded as limiting the scope of the
invention. The above illustrations serve to describe the principles and methodologies
involved in creating the types of genetic alterations that can be introduced into Sac. erythraea
and/or other Streptomyces that result in the synthesis of novel glycosylation-modified
polyketide products. Although a single Type I alteration, leading to the production of, for
example, 4"-deoxy-4"-oxo-erythromycin A, is specified herein, it is obvious to those skilled
in the art that other Type I changes can be introduced into the eryB and/or eryC genes leading
to novel glycosylation-modified polyketide structures. Examples of additional Type I
alterations leading to useful novel compounds include but are not limited to: mutations in the
eryBVII gene conceivably leading to 3-α-D-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B
and mutations in the eryCVI gene conceivably leading to N-3a'-des-dimethyl
erythromycin A. Moreover, it is obvious that Type I alterations in two or more different eryB
and/or eryC genes can be combined leading to novel glycosylation-modified polyketide
structures. Examples of combinations of two Type I alterations leading to useful compounds
include but are not limited to: mutations in the eryBIV and eryBVII genes conceivably leading
to 3-α-D-4"-deoxy-4"-oxo-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B;
mutations in the eryBIV and eryCVI genes conceivably leading to
4"-deoxy-4"-oxo-(N-3a-des-dimethyl)-erythromycin A; and mutations in the eryBIV, eryBVII,
and eryCVI genes conceivably leading to
3-α-D-4"-deoxy-4"-oxo-mycarosyl-5-β-D-(N-3a-des-dimethyl)-desosaminoyl-12-hydroxy-erythronolide B.
All Type I mutations or combinations of two or
more Type I mutations in the eryBII, eryBIV, eryBV, eryBVI, eryBVII, eryCII, eryCIII,
eryCIV, eryCV, or eryCVI genes, the Sac. erythraea strains that carry said mutations or
combinations of mutations, and the corresponding polyketides produced from said strains,
therefore, are included within the scope of the present invention.

Although the Type II mutation specified herein was constructed with the eryBVII gene
on a self-replicating plasmid, it is obvious that other eryB genes and eryC genes can be
expressed in an antisense orientation leading to novel glycosylation-modified polyketide
structures. Examples of additional Type II alterations leading to useful compounds include
but are not limited to: antisense expression of the eryBIV gene conceivably leading to
4"-deoxy-4"-oxo-erythromycin A and antisense expression of the eryCVI gene conceivably
leading to N-3a'-des-dimethyl erythromycin A. Moreover, it will occur to those skilled in the
art that promoters other than the ermE* promoter, for example the melC promoter of pIJ702,
will be suitable for antisense expression, and that many self-replicating vectors in addition to
pWHM4 will function to carry the antisense alteration. It will also occur to those skilled in
the art that a self-replicating vector is not required for this invention and that the antisense
alteration can be introduced directly into the chromosome using the same principles
employed to construct a Type I gene alteration. An example of a Type II alteration that is
introduced directly into the chromosome is the eryBVII antisense alteration described in
Example 2 wherein DNA segments immediately upstream of the eryK gene are used to flank
the ermE*-eryBVII-phage fd terminator grouping in a pWHM3 vector, and this vector is
integrated into and then resolved from the chromosome, leaving the ermE*-eryBVII-phage fd
terminator grouping stably incorporated into this nonessential region of the chromosome of
Sac. erythraea, conceivably leading to the production of
3-α-D-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B. All Type II mutations in
the eryBII, eryBIV,
eryBV, eryBVI, eryBVII, eryCII, eryCIII, eryCIV, eryCV, or eryCVI genes, whether carried on
a self-replicating plasmid or integrated into a nonessential region of the chromosome, the Sac.
erythraea strains that carry said mutations, and the corresponding polyketides produced from
said strains, therefore, are included within the scope of the present invention.

Although Type III alterations, leading to the production of 5-des-chalcosyl-5-desosaminoyl
lankamycin in Streptomyces violaceoniger and 3-des-oleandrosyl-3-mycarosyl
oleandomycin in Streptomyces antibioticus, are specified herein, it is obvious that Type III
alterations can be introduced into any polyketide producing microorganism leading to novel
glycosylation modified polyketides. It will also occur to those skilled in the art that both the
eryB and eryC genes can either be cotransformed into a polyketide producing microorganism
or grouped together on a single vector that is introduced into a polyketide producing
microorganism. An example of a Type III change using both the eryB and eryC genes
together is their introduction into Streptomyces violaceoniger, conceivably leading to
3-des-(4"-O-acetylarcanosyl)-3-mycarosyl-5-des-chalcosyl-5-desosaminoyl lankamycin. Although
the Type III alterations specified herein have indicated a specific genetic order of the eryB or
eryC genes, it will occur to those skilled in the art that many different genetic arrangements of
the eryB or eryC genes will produce similar results. It will also occur to those skilled in
the art that certain arrangements of the eryB and/or eryC genes that lack one or more of the
respective eryB and/or eryC genes will lead to the production of novel glycosylated
polyketides in which intermediate compounds in the biosynthesis of mycarose and/or
desosamine, respectively, such as those outlined in FIGS. 2 and 3, are attached to the
polyketide. An example of a Type III alteration in which only a subset of the eryB and/or
eryC genes are used is the introduction of a pXC4 derivative that lacks the eryCVI gene,
removed by digestion of plasmid pXC4 with AflII and AvrII followed by treatment with the
Klenow fragment of DNA polymerase I and religation, into Streptomyces violaceoniger,
leading to the production of 5-des-chalcosyl-5-(N-3a-des-dimethyl desosaminoyl)
lankamycin. It will also occur to those skilled in the art that promoters other than
ermE or ermE*, such as the melC promoter of plasmid pIJ702, and vectors other than
pWHM4 or pIJ702 can also be utilized in the construction of a Type III alteration, and these
variants are, of course, considered to be within the scope of the invention. Finally, it will also
occur to those skilled in the art that a self-replicating vector is not required for this invention
and that an assembly of sugar biosynthesis genes can be introduced directly into the
chromosome of a heterologous host using the same principles employed to construct a Type I
gene alteration once a nonessential region of the heterologous host chromosome has been
identified. Alternatively, plasmids or bacteriophages which undergo site-specific
recombination with host genes may also be used to introduce eryB and eryC genes into a host
to effect Type III alterations. All Type III alterations using one or more of the eryBII,
eryBIV, eryBV, eryBVI, eryBVII, eryCII, eryCIII, eryCIV, eryCV, or eryCVI genes, the
polyketide producing strains that carry said alterations, and the corresponding polyketides
produced from said strains, therefore, are included within the scope of the present invention.

In addition, it is also possible to create combinations of Type I and Type II alterations
such that some Type I eryB and/or eryC mutations are introduced directly into the Sac.
erythraea chromosome in the appropriate locus, while other eryB and/or eryC genes are
inactivated by Type II alterations using a self-replicating or integrating vector. For example,
combination of a Type I alteration, such as a mutation in eryBIV, and a Type II alteration,
such as transformation with pASBVII, will conceivably lead to production of
3-α-D-4"-deoxy-4"-oxo-mycarosyl-5-β-D-desosaminoyl-12-hydroxy-erythronolide B. All
combinations of two or more alterations of Type I and Type II, the Sac. erythraea strains that
carry such alterations, and the glycosylated polyketides produced from such strains are
included within the scope of the present invention.

As an extension of the examples reported with the eryB and/or eryC genes, it is
possible to apply the method described herein to heterologous sugar biosynthesis genes that
are similar to the eryB and/or eryC genes. The construction of strains carrying heterologous
sugar biosynthesis genes that lead to the production of novel glycosylated polyketides
requires: (i) cloning of the sugar biosynthesis genes from any other glycosylated-polyketide
producing actinomycete; (ii) determining the nucleotide sequence of the cloned gene(s); (iii)
excising and assembling the cloned gene(s) into vectors suitable for Type I, Type II, or Type
III alterations; and (iv) transformation of polyketide producing microorganisms and screening
for the novel compound. Any polyketide-associated sugar biosynthesis gene can thus be
precisely excised from the genome of a glycosylated polyketide producing microorganism
and altered or arranged with other sugar biosynthesis genes and then introduced into the same
or another polyketide producing microorganism to create a novel glycosylated polyketide of
predicted structure. Thus, for example, a Type I or Type II alteration of a heterologous gene
that is similar to an eryB and/or eryC gene, such as can be found in the eryBVII homolog for
the synthesis of L-oleandrose in Streptomyces antibioticus, to result in the production of
3-des-L-oleandrosyl-3-D-oleandrosyl oleandomycin is included within the scope of the present
invention. Similarly, a Type III assembly of the genes for the synthesis of a sugar other than
mycarose or desosamine, such as can be found in the genes for the synthesis of angolosamine
in Streptomyces eurythermus, and their transformation into Sac. erythraea to result in the
synthesis of 5-des-desosaminoyl-5-angolosaminoyl-erythromycin A is included within the
scope of the present invention.

It will occur to those skilled in the art that the Type I, Type II, and Type III genetic 
manipulations described herein and the polyketide producing microorganisms into which they 
are introduced are in no way exclusive. Hence, the choice of a convenient host and the 
choice of a Type I, Type II, or Type III alteration is based solely on the relatedness of the 
desired novel glycosylated polyketide to a natural counterpart. Therefore, Type I, Type II, 
and Type III alterations can be constructed in any polyketide producing microorganism 
employing either endogenous or exogenous sugar biosynthesis genes. Thus all Type I, Type 
II, and Type III mutations or various combinations thereof constructed in any polyketide 
producing microorganism according to the principles described herein, and the respective 
polyketides produced from such strains, are included within the scope of the present 
invention. Examples of glycosylated polyketides that can be altered by creating Type I, Type 
II, or Type III changes in the producing microorganisms include, but are not limited to, 
macrolide antibiotics such as erythromycin, tylosin, spiramycin, etc.; aromatic polyketides 
such as daunorubicin and doxorubicin, etc.; polyenes such as candicidin, amphotericins, etc.; 
and other complex polyketides such as avermectin. 

Whereas the novel derivatives or modifications of erythromycin described herein have 
been specified as the A derivatives, such as 4"-deoxy-4"-oxo-erythromycin A, those skilled in 
the art understand that the wild type strain of Sac. erythraea produces a family of 
erythromycin compounds, including erythromycin A, erythromycin B, erythromycin C, and 
erythromycin D. Thus, modified strains of Sac. erythraea, such as strain ERBIV, for 
example, would be expected to produce the corresponding members of the 4"-deoxy-4"-oxo- 
erythromycin family, including 4"-deoxy-4"-oxo-erythromycin A, 4"-deoxy-4"-oxo- 
erythromycin B, 4"-deoxy-4"-oxo-erythromycin C, and 4"-deoxy-4"-oxo-erythromycin D. 
Similarly, all other modified strains of Sac. erythraea that produce novel glycosylated 
erythromycin derivatives would be expected to produce the A, B, C, and D forms of said 
derivatives. For example, modified Sac. erythraea strains that produce 6-deoxyerythromycin, 
6,12-dideoxyerythromycin and 6,7-anhydroerythromycin would be expected to produce novel 
glycosylation-modified polyketides by introduction of the additional modification of a Type 
I, II or III change in a sugar biosynthesis gene. Therefore, all members of the family of each 
of the novel erythromycins described herein or produced by these methods are included 
within the scope of the present invention. 

Variations and modifications of the methods for obtaining the desired plasmids, hosts 
for cloning and choices of vectors and eryB and/or eryC genes to clone and modify, other 
than those described herein, will occur to those skilled in the art. For example, although we 
have described the use of plasmids pWHM3, pWHM4, and pIJ702, other vectors can be 
employed wherein all or part of said plasmids is replaced by other DNA segments that 
function in a similar manner, such as replacing the pUC19 component of pWHM3 and 
pWHM4 with pBR322, available from BRL; or employing different segments of the pIJ101 
replicon in pWHM3 and pIJ702, or the pJV1 replicon in pWHM4, respectively; or employing 
selectable markers other than thiostrepton- or ampicillin-resistance. These are just a few of a 
long list of possible examples, all of which are included within the scope of the present 
invention. Similarly, the segments of the eryB and eryC loci that have been specified herein 
to generate the various Type I, Type II, and Type III alterations can readily be replaced with 
other segments of different length encoding the same functions, either produced by PCR- 
amplification of genomic DNA or of an isolated clone, or by isolating suitable restriction 
fragments from Sac. erythraea. In the same way it is possible to create Type I mutations 
functionally equivalent to those described herein by altering through deletion, insertion, or 
site directed mutagenesis different portions of the corresponding genes. It is also possible to 
create Type II mutations functionally equivalent to those described herein by employing 
larger or smaller portions of the corresponding genes; and it is possible to create Type III 
mutations using larger or smaller segments of the corresponding genes in the same or 
different linear order described herein. Additional modifications include changes in the 
restriction sites used for cloning or in the general methodologies described above. All such 
changes are included in the scope of the present invention. It will also occur to those skilled 
in the art that different methods are available to ferment Sac. erythraea and other polyketide 
producing microorganisms and to extract the novel polyketides specified herein, and all such 
methods are also included within the scope of this invention. 
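
Whether an alternative segment still encodes the same function can be checked at the sequence level by translating it and comparing the result with the expected polypeptide. The following Python sketch is an illustration only and is not part of the original disclosure. The names CODON_TABLE and translate are hypothetical helpers, the standard genetic code is assumed, and the test input is the first 120 nucleotides of SEQ ID NO:1 as printed in the sequence figure, read from the start codon at position 54.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (an assumption of this sketch).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start):
    """Translate dna beginning at 1-based position start.

    Translation stops at the first stop codon or when fewer than
    three nucleotides remain.
    """
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1 to 120 of SEQ ID NO:1, copied from the sequence figure.
SEQ1_1_120 = (
    "CACGCCGACGCGATCGCGCGGCACATCGACGCCTGGCTGGGCGGAGCGAATTCATGACCA"
    "CGACCGATCGCGCCGGGCTGGGCAGGCAGCTCCAGATGATCCGCGGCCTGCACTGGGGTT"
)

print(translate(SEQ1_1_120, 54))
```

Comparing the printed residues with the one-letter translation shown beneath the nucleotide sequence is a quick way to confirm that a substituted or PCR-amplified segment has kept the reading frame.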

It will also be apparent that many modifications and variations of the invention as set 
forth herein are possible without departing from the spirit and scope thereof, and that, 
accordingly, such limitations are imposed only as indicated by the appended claims. 
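
The nucleotide ranges recited in the claims that follow can be sanity-checked arithmetically: a complete open reading frame, stop codon included, should span a whole number of codons. The Python sketch below is an illustration only and is not part of the original disclosure. The names CLAIMED_RANGES and range_summary are hypothetical, and the coordinates are taken from claim 1 on the assumption that they are 1-based and inclusive.

```python
# Coordinate ranges recited in claim 1, assumed 1-based and inclusive.
CLAIMED_RANGES = {
    "SEQ ID NO:1": [(54, 1136), (1147, 2412), (2409, 3410)],
    "SEQ ID NO:2": [
        (80, 1048), (1048, 2295), (2348, 3061), (3214, 4677),
        (4674, 5879), (5917, 7386), (7415, 7996),
    ],
}


def range_summary(start, end):
    """Return (length in nucleotides, whole codons, in-frame flag)."""
    length = end - start + 1
    return length, length // 3, length % 3 == 0


for seq_id, ranges in CLAIMED_RANGES.items():
    for start, end in ranges:
        length, codons, in_frame = range_summary(start, end)
        print(f"{seq_id} {start}-{end}: {length} nt, "
              f"{codons} codons, in frame: {in_frame}")
```

Under these assumptions every recited range is a multiple of three nucleotides long. Some adjacent ranges share nucleotides, for example 1147 to 2412 and 2409 to 3410 of SEQ ID NO:1, which is consistent with overlapping stop and start codons.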

We claim: 

1. An isolated single or double stranded polynucleotide having a nucleotide sequence 
which comprises (a) a nucleotide sequence selected from the group consisting of (i) the 
sense sequence of SEQ ID NO:1 from about nucleotide position 54 to about nucleotide 
position 1136; (ii) the sense sequence of SEQ ID NO:1 from about nucleotide position 1147 
to about nucleotide position 2412; (iii) the sense sequence of SEQ ID NO:1 from about 
nucleotide position 2409 to about nucleotide position 3410; (iv) the sense sequence of SEQ 
ID NO:2 from about nucleotide position 80 to about nucleotide position 1048; (v) the sense 
sequence of SEQ ID NO:2 from about nucleotide position 1048 to about nucleotide position 
2295; (vi) the sense sequence of SEQ ID NO:2 from about nucleotide position 2348 to about 
nucleotide position 3061; (vii) the sense sequence of SEQ ID NO:2 from about nucleotide 
position 3214 to about nucleotide position 4677; (viii) the sense sequence of SEQ ID NO:2 
from about nucleotide position 4674 to about nucleotide position 5879; (ix) the sense 
sequence of SEQ ID NO:2 from about nucleotide position 5917 to about nucleotide position 
7386; and (x) the sense sequence of SEQ ID NO:2 from about nucleotide position 7415 to 
about nucleotide position 7996; 

(b) sequences complementary to the sequences of (a); 

(c) sequences that, on expression, encode a polypeptide encoded by the 
sequences of (a); and 

(d) analogous sequences that hybridize under stringent conditions to the 
sequences of (a). 

2. The polynucleotide of claim 1 that is a DNA molecule or RNA molecule. 

3. The polynucleotide of claim 2 wherein the nucleotide sequence is the nucleotide 
sequence of (a) selected from the group consisting of (i) the sense sequence of SEQ ID NO:1 
from about nucleotide position 54 to about nucleotide position 1136; (ii) the sense sequence 
of SEQ ID NO:1 from about nucleotide position 1147 to about nucleotide position 2412; (iii) 
the sense sequence of SEQ ID NO:2 from about nucleotide position 2348 to about nucleotide 
position 3061; (iv) the sense sequence of SEQ ID NO:2 from about nucleotide position 4674 
to about nucleotide position 5879; and (v) the sense sequence of SEQ ID NO:2 from about 
nucleotide position 5917 to about nucleotide position 7386. 

4. The polynucleotide of claim 2 wherein the nucleotide sequence is the nucleotide 
sequence of (a) selected from the group consisting of (i) the sense sequence of SEQ ID NO:1 
from about nucleotide position 2409 to about nucleotide position 3410; (ii) the sense 
sequence of SEQ ID NO:2 from about nucleotide position 80 to about nucleotide position 
1048; (iii) the sense sequence of SEQ ID NO:2 from about nucleotide position 1048 to about 
nucleotide position 2295; (iv) the sense sequence of SEQ ID NO:2 from about nucleotide 
position 3214 to about nucleotide position 4677; and (v) the sense sequence of SEQ ID NO:2 
from about nucleotide position 7415 to about nucleotide position 7996. 

5. The polynucleotide of claim 2 wherein the nucleotide sequence is the nucleotide 
sequence of (a) having the sense sequence of SEQ ID NO:2 from about nucleotide position 
80 to about nucleotide position 1048. 

6. A vector comprising the DNA molecule of claim 2. 

7. The vector of claim 6 further comprising an enhancer-promoter operatively linked to 
the polynucleotide. 

8. The vector of claim 6 wherein the polynucleotide has the nucleotide sequence of 
claim 5. 

9. A host cell transformed with the vector of claim 6 or claim 7 or claim 8. 

10. The transformed host cell of claim 9 that is a bacterial cell. 

11. The transformed host cell of claim 10 wherein the bacterial cell is selected from the 
group consisting of Streptomyces and E. coli. 

12. A method for directing the biosynthesis of specific glycosylation-modified 
polyketides by genetic manipulation of a polyketide-producing microorganism, said method 
comprising the steps of: 

(1) isolating a sugar biosynthesis gene-containing DNA sequence according to claim 1; 

(2) identifying within said gene-containing DNA sequence one or more DNA 
fragments responsible for the biosynthesis of a polyketide-associated sugar or its attachment 
to a polyketide; 

(3) creating one or more specified changes into said DNA fragment or fragments, 
thereby resulting in an altered DNA sequence; 

(4) introducing said altered DNA sequence into a polyketide-producing 
microorganism to replace the original sequence, said altered DNA sequence, when translated, 
resulting in altered enzymatic activity capable of effecting the production of said specific 
glycosylation-modified polyketide; 

(5) growing a culture of said altered polyketide-producing microorganism under 
conditions suitable for the formation of said specific glycosylation-modified polyketide; and 

(6) isolating said specific glycosylation-modified polyketide from said culture. 

13. The method of claim 12 wherein said specified change in said DNA fragment or 
fragments results in the inactivation of at least one enzymatic activity involved in the 
biosynthesis of a polyketide-associated sugar or in its attachment to a polyketide. 

14. The method of claim 13 wherein said polyketide-associated sugar is L-mycarose. 

15. The method of claim 13 wherein said polyketide-associated sugar is D-desosamine. 

16. A method for directing the biosynthesis of specific glycosylation-modified 
polyketides by genetic manipulation of a polyketide-producing microorganism, said method 
comprising the steps of: 

(1) isolating a sugar biosynthesis gene-containing DNA sequence according to claim 1; 

(2) identifying within said gene-containing DNA sequence one or more DNA 
fragments responsible for the biosynthesis of a polyketide-associated sugar or its attachment 
to a polyketide; 

(3) reversing the strand orientation of said DNA fragment or fragments, thereby 
resulting in an altered DNA sequence which, when transcribed, results in production of an 
antisense mRNA; 

(4) introducing said altered DNA sequence into a polyketide-producing 
microorganism having an mRNA capable of binding to said antisense mRNA to produce an 
altered polyketide-producing microorganism capable of producing said specific 
glycosylation-modified polyketide; 

(5) growing a culture of said altered polyketide-producing microorganism under 
conditions suitable for the formation of said specific glycosylation-modified polyketide; and 

(6) isolating said specific glycosylation-modified polyketide from said culture. 

17. A method for directing the biosynthesis of specific glycosylation-modified 
polyketides by genetic manipulation of a polyketide-producing microorganism, said method 
comprising the steps of: 

(1) isolating a sugar biosynthesis gene-containing DNA sequence according to claim 1; 

(2) identifying within said gene-containing DNA sequence one or more DNA 
fragments responsible for the biosynthesis of a polyketide-associated sugar or its attachment 
to a polyketide; 

(3) introducing said DNA fragment or fragments into a distinct polyketide-producing 
microorganism to produce an altered polyketide-producing microorganism capable of 
producing said specific glycosylation-modified polyketide; 

(4) growing a culture of said polyketide-producing microorganism containing said 
DNA fragment or fragments under conditions suitable for the formation of said specific 
glycosylation-modified polyketide; and 

(5) isolating said specific glycosylation-modified polyketide from said culture. 

18. The method of claim 13 or claim 16 or claim 17 wherein said DNA fragment 
comprises one or more genes which encode an enzymatic activity involved in the 
biosynthesis of L-mycarose or in its attachment to a polyketide. 

19. The method of claim 13 or claim 16 or claim 17 wherein said DNA fragment 
comprises one or more genes which encode an enzymatic activity involved in the 
biosynthesis of D-desosamine or in its attachment to a polyketide. 

20. The method of claim 13 or claim 16 or claim 17 wherein said DNA fragment is the 
sequence of claim 8. 

21. An isolated polypeptide having an amino acid sequence encoded by a nucleotide 
sequence selected from the group consisting of the sense sequence of SEQ ID NO:1 from 
about nucleotide position 54 to about nucleotide position 1136; the sense sequence of SEQ ID 
NO:1 from about nucleotide position 1147 to about nucleotide position 2412; the sense sequence 
of SEQ ID NO:1 from about nucleotide position 2409 to about nucleotide position 3410; the 
sense sequence of SEQ ID NO:2 from about nucleotide position 80 to about nucleotide 
position 1048; the sense sequence of SEQ ID NO:2 from about nucleotide position 1048 to 
about nucleotide position 2295; the sense sequence of SEQ ID NO:2 from about nucleotide 
position 2348 to about nucleotide position 3061; the sense sequence of SEQ ID NO:2 from 
about nucleotide position 3214 to about nucleotide position 4677; the sense sequence of SEQ 
ID NO:2 from about nucleotide position 4674 to about nucleotide position 5879; the sense 
sequence of SEQ ID NO:2 from about nucleotide position 5917 to about nucleotide position 
7386; and the sense sequence of SEQ ID NO:2 from about nucleotide position 7415 to about 
nucleotide position 7996. 

22. An isolated polypeptide of claim 21 encoded by the sequence of SEQ ID NO:2 from 
about nucleotide position 80 to about nucleotide position 1048. 

1 CACGCCGACGCGATCGCGCGGCACATCGACGCCTGGCTGGGCGGAGCGAATTCATGACCA 60 

M T T 

61 CGACCGATCGCGCCGGGCTGGGCAGGCAGCTCCAGATGATCCGCGGCCTGCACTGGGGTT 120 
TDRAGLGRQLQMIRGLHWGY 

121 ACGGCAGCAACGGCGACCCTTACCCGATGCTGCTGTGCGG ACACGACGACGACCCGCAGC 180 
GSNGDPYPMLLCGHDDDPQR 

181 GCCGGTACCGCTCGATGCGCGAGTCCGGTGTGCGGCGCAGGACCGAGACGTGGGTGGTGG 240 
RYRSMRESGVRRRTETWVVA 

241 CCGACCACGCCACCGCCCGGCAGGTGCTCGACGACCCCGCGTICACCCGCGCCACCGGAC 300 
DHATARQVLDDPAFTRAT G R 

301 GCACACCGGAATGGATGCGMCCGCGGGCGCGCCACCCGCCGAGTGGGCCCAGCCGTTCC 360 
TPEWMRAAGAPPAEWAQPFR 

* • * * 

361 GGGACGTGCACGCCGCGTCCTGGGAAGGCGAGGTCCCCGACGTCGGGGAACTGGCGGAGA 420 

D V H AASWEGEVP D V G E L A E S 

421 GCTTCGCCGGTCTGCTCCCCGGCGCGGGCGCGCGGCTGGACCTGGTCGGCGACTTCGCCT 480 
FAGLLPGAGARLDLVGDFAW 

481 GGCAGGTACCGGTGCAGGGCATGACCGCCGTGCTCGGCGCAGCCGGAGTGCTGCGCGGCG 540 
QVPVQGMTAVLGAAGVLRGA 

541 CCGCGTGGGACGCCCGCGTCAGCX2TGGACGCCCAGCTCAGCCCGCAGCAGCTCGCGGTGA 600 
AWDARVSLDAQLSP QQLAVT 

• • ♦ • * • 

601 CCGAAGCAGCGGTCGCGGCACTGCCCGCCGACCCCGCACTGCGCGCCCTGTTCGCCGGGG 660 
EAAVAALPADPALRAL FAGA 

661 CCGAGATGACCGCGAACACCGTGGTCGACG(XK3TCCTGGCCGTCT^ 720 
EMTANTVVDAV LAVSABPG L 

• ... • • • 

721 TGGCCGAACGGATCGCCGACGACCCCGCCGCCGCGCAGCGAACCGTCGCCGAGGTGCTGC 780 
AERIADDPAAAQRTVAEV .LR 

781 GCCTGCACCCGGCATTGCACCTGGAGCGGCGCACGGCCACCGCAGAGGTGCGGCTCGGCG 840 
LHPALHLERRTATAEVRLGE 

841 AGCACGTGATCGGCGAAGGCGAGGAGGTCGTGGTCGTCGTCGCGGCGGCCAACCGCGACC 900 
HVIGEGEEVVVVVAAANRDP 

901 CGGAGGTCTTCGCCGAGCCCGACCGCCTCGACGTGGACCGCCCCGACGCCGACCGCGCGC 960 
EVFAEPDRLDVDRPDADRAL 

961 TGTCGGCACATCGCGGCCACCCCGGCAGGCTGGAGGAGCTGGTCACCGCGCTCGCCACCG 1020 
SAHRGHPGRLEELVTALATA 

1021 CCGCACTGCGGGCCGCGGCCAAGGCGCTGCCCGGACTCACGCCCAGCGGCCCGGTCGTCC 1080 
ALRAAAKALPGLTPSGPVVR 

„ . . • ■ • 

1081 GGCGCCGCCGATCACCCGTCCTGCGGGGAACCAACCGCTGCCCCGTCGAGCTCTGAGGAT 1140 
RRRSPVLRGTNRCPVEL* 

1141 TCCGCGATGCGCGTCGTCTTCTCCTCCATGGCCAGCAAGAGCCACCTCTTCGGCCTCGTC 1200 
MRVVFSSMASKS HLFGLV 

1201 CCCCTCGCATGGGCGTTCCGCGCGGCGGGGCACGAGGTCCGCGTGGTCGCGTCCCCGGCG 1260 
PLAWAFRAAGHEVRVVASPA 

1261 CTCACCGAGGACATCACCGCGGCCGGGCTGACCGCCGTCCCGGTCGGCACCGACGTCGAC 1320 
LTEDITAAGLTAVPVGSDVD 

• • • • • 

1321 CTCGTGGACTTCATGACCCACGCGGGCCACGACATCATCGACTACGTCCGGAGCCTGGAC 1380 
LVDFMTHAGHD IIDYVRSLD 

• • • • 

1381 TTCAGCGAGCGGGACCCCGCCACCTTGACCTGGGAGCACCTGCGGGGCATGCAGACCGTG 1440 
FSERDPATLTWEHLRGMQTV 

1441 CTCACCCCGACCTTCTACGCCCTGATGAGCCCGGACACGCTCATCGAAGGCATGGTCTCG 1500 
LTPTFVALMSPDTLIEGMVS 

1501 TTCTGCCGGAAGTGGCGGCCCGACCTGGTCATCTGGGAGCCGCTCACCTTCGCCGCGCCC 1560 
FCRKWRPDLVIWEP LTFAAP 

1561 ATCGCGGGCGCGGTGACCGGAACGCCGCACGCGCGGCTGCTGTGGGGACCCGACATCACC 1620 
IAGAVTGTPHARLLWGPD IT 

1621 ACCCGGGCGCGGCAGAACTTCCTCGGCCTGCTGCCCGAC^AGCCGGAGGAGCACCGGGM 1680 
TRARQNFLGLLPDQPEEHRE 

1681 GGCCCGCTCGCCGAGTGGCT CACCTGGACGCTGGAGAAGT ACGGCGGCCCGGCCTTCGAC 1740 
GPLAEWLTHTLEKYG GPA FD 

1741 GAGGAGGTGGTCGTCGGGCAGTGGACGATCGACCCCGCCCCGGCCGCGATCAGGCTCGAC 1800 
EEVVVGQWTIDPAPAAIRLD 

1801 ACCGGCCTGAAGACCGTCGTCATGCGCTACGTCGACTACAACGGGCCGTCCGTGGTGCCG 1860 
TGLKTVGMRYVDYNGPSVVP 

1861 GAATGGCTGCACGACGAGCCCGAGCGCCGCCGCGTGTGCCTCACGCTCGGGATCTCCAGC 1920 
EWLHDEPERRRVCLTLGISS 

1921 CGCGAGAACAGCATCGGGCAGGTCTCCATCGAGGAGCTGCTGGGTGCCGTCGGCGACGTC 1980 
RENSIGQVSIEELLGAVGDV 

1981 GACGCCGAGATCATCGCGACCTTCGACGCGCAGCAGCTAGAAGGCGTCGCGAACATCCCG 2040 
DAEIIATFDAQQLEGVANIP 



2041 CACAACGTCCGCACGGTCGGCTTCGTCCCGATGCACGCGCTGCTGCCGACCTGCGCGGCG 2100 
HNVRTVGFVPMHALLPTCAA 

2101 ACGGTGCACCACGGCGGACCCGGGAGCTGGCACACCGCGGCGATCCACGGCGTGCCGCAG 2160 
TVHHGGPGSWHTAAIHGVPQ 

• • * 

2161 GTGATCCTGCCCGACGGCTGGGACACCGGCGTGCGCGCGCAGCGCACGCAGGAATTCGGG 2220 
VILPOGWDTGVRAQRTQEFG 

2221 GCGGGGATCGCG CTGC CC GTGC CCGAGCTGACCCCCGACC AGCT CCGGGAGTCGGTGAAG 2280 
AGIALPVP ELTPDQLRESVK 

♦ * • 

2281 CGGGTCCTCGACGACCCGGCCCACCGCGCCGGCGCGGCGCGGATGCGCGACGACATGCTC 2340 
RVLD D P AH RAGAARMRD DML 

2341 GCGGAGCCGTCACCGGCCGAGGTCGTCGGCATCTGCGAGGAACTGGCCGCAGGAAGGAGA 2400 
AEP S P A E VVGI C . E E LAAGRR 

2401 GAACCACGATGACCACCGACGCCGCGACGCACGTGCGGCTCGGGCGTTCCGCGCTGCTCA 2460 
E P R * 

MTTDAATHVRLGRSALLT 

* 

2461 CCAGCAGGCTCTGGCTCGGCACGGTGAACTTC AGCGGACGCGTCGAGG ACGACGACGCGC 2520 
SRLWLGTVKFSGRVEDDDAL 

2521 TGCGCCTGATGGACCACGCCCGGGACCGCGGCATCAACTGCCTCGACACCGCCGACATGT 2580 
RLMD HARDRG ZKCLDTAD M Y 

2581 ACGGCTGGCGGCXCTACAAGGGCCACACCGAGGAGCTGCTGGGCAGGTGGCT 2640 
G W R L YKGHTEELVGRWLAQG 

2641 GCGGCGGACGGCGCGAGGAC ACCGTGCTGGCG ACCAAGGTCGGCGGCGAGATGAGCGAGC 2700 
GGRREDTVLATKVGGEMSE R 

2701 GCGTCAACGACAGCGGGCTGTCGGCGCGGCACATCATCGCCTCCTGCGAGGGATCG CTGC 2760 
V N D S GLSARHX XASCEGS I»R 



2761 GCAGGCTGGGCGTCGACCAWTCGACGTCTACCyiGATGCACCACATCGACCGGTCCGCGC 2820 
RLGVDHIDVYQMBHIDRSAP 

2821 CGTGGGACGAGGTGTGGCAGGCCATGGACAGCCTCGTCGCCAGCGGCAAGGTCTCCTACG 2880 
WDEVWQAMDSLVASGKVSYV 

2881 TCGGCTCGTCGAACTTCGCGGGCTGGCACATCGCCGCCGCGCAGGAGAACGCCGCCCGCC 2940 
GSSNFAGWHIAAAQENAARR 



2941 GCCACTCCCTGGGCATGGTCTCCCACCAGTGCCTGTACAACCIGGCGGTCCGGCACGCCG 3000 
HSLGMVS HQCLYNLAVRBAE 

3001 AGCTGGAGGTGCTGCCCGCCGCGC AGGCCTACGGGCTCGGCGTCTTCGCCTGGTCGCCGC 3060 
LEVLPAAQAYGLGVFAWSPIi. 

3061 TGCACGGCGGCCTGCTCAGCGGAGCGCTGGAGAAGCTGGCCGCGGGCACCGCGGTGAAGT 3120 
HGGLLSGALEKLAAGTAVKS 

3121 CGGCGCAGGGCCGTGCGCAGGTGCTGTTGCCGTCCCTGCGCCCGGCGATCGAGGCCTACG 3180 
AQGRAQVLLPSLRPAIEAYE 

3181 AGAAGTTCTGCCGCAACCTCGGCGAAGACCCGGCCGAGGTGGGGCTCGCATGGGTGCTGT 3240 
KFCRNLGEDPAEVGLAWVLS 

3241 CCCGGCCCGG»TCGCCGGCGCCGTCATCGGCCCGCGAACCCCCGAGCAG^CGACTCCG 3300 
RPGIAGAVIGPRTPEQLDSA 

3301 CGCTGAAGGCGTCCGCGATGACCCTGGAC^GCAGGCGCTGTCCGAACTGGACGAGATCT 3360 
LKASAMTLDEQALSEL DEIF 

3361 TCCCCGCGGTGGCCTCCGGCGGCGCGGCGCCGGAAGCCTGGTTGCAGTGAGCACAAGAGG 3420 
PAVA SGGAAPEA WLQ* 

3421 AACCGAGAAAGGATACGGCTGGTGAGCGTGAAGCAGAAGTCAGCGTTGCAGGACCTGGTC 3480 

3481 GACTTCGCC^GTGGCACGTGTGGACCAGGGTGCGGCCGTCCAGCCGTGCGCGCCTGGCC 3540 



3541 TACGAGCTGTOCGCCGACGACCACGAGGC^CGACCGAGGGCGCCTACATCAACCTCGGC 3600 



• # * ♦ ♦ . • 

3601 TACTGGAAGCCCGGGTGCGCCGGCCTGGAGGAGGCCAACCAGGAGCTGGCGAACCAGCTC 3660 



. • . • • • 

3661 GCCGAGGCCGCG5GGATCAGCGAGGGCGACGAGGTGCTCGACGTCGGGTTCGGGCTCGGC 3720 



3721 GCGCAGGACTTCTTCTGGCTCGACCTGCAGCCAGCT 3756 

1 CGGGTTGCCGCACATCGCGCTGGGGAGATTCTTTGAATTTCGCCCGTAGCACCGACCTGG 60 



61 AAAGCGAGCAAATGCTCCGGTGAATGGGATCAGTGATTCCCCGCGTCAATTGATCACCCT 120 

VNGISDSPRQLITL 



121 TCTGGGCGCTTCCGGCTTCGTCGGGAGCGCGGTTCTGCGCGAGCTGCGCGACCACCCGGT 180 
LGASGFVGSAVLRELRDHPV 

181 CCGGCTGCGCGCGGTGTCCCGCGGCGGAGCGCCCGCGGTTCCGCCCGGCGCCGCGGAGGT 240 
RLRAVSRGGAPAVPPGAAEV 



241 CGAGGACCTGCGCGCCGACCTGCTGGAACCGGGCCGGGCCGCCGCCGCGATCGAGGACGC 300 
EDLRADLLEPGRAAAAIEDA 

301 CGACGTGATCGTGCACCTGGTGGCGCACGCAGCGGGCGGTTCCACCTGGCGCAGCGCCAC 360 
DVIVHLVAHAAGGSTWRSAT 

• • • ■ • 

361 CTCCGACCCGGAAGCCGAGCGGGTCAACGTCGGCCTGATGCACGACCTCGTCGGCGCGCT 420 

SDPEAERVHVGLMHD LVGAL 

.•**•* 
421 GC ACGATCGCCGCAGGTCG ACGCCGCCCGTGTTGCTCTACGCGAGCACCGCACAGGCCGC 480 
HDRRRSTPPVLLYASTAQAA. 

• •••** 
481 GAACCCGTCGGCGGCCAGCAGGTACGCGCAGCAGAAGACCGAGGCCGAGCGCATCCTGCG 540 

NPSAA SRYAQ QKTEAERI1R 



• « • 

541 CAAAGCCACCGACG AGGGCCGGGTGCGCGGCGTGATCCTGCGGCTGCCCGCGGTCTACGG 600 
K A T D E GRVRGVILRLPAVYG 



601 CCAGAGCGGCCCGTCCGGCCCCATGGGGCGGGGCGTGGTCGCAGCGATGATCCGGCGTGC 660 
QSGPSGPMGRGVVAAMIRRA 



661 CCTCGCCGGCGAGCOTCTCACCATGTGGCACGACGGCGGCGTGCGCCGCGACCTGCTGCA 720 
L A G E P LTMW HDGGVRRDLLH 



» • • • • • 

721 CGTCGAGGACGTGGCCACCGCGTTCGCCGCCGCGCTGGAGCACCACGACGCGCTGGCCGG 780 

V EDVA TA FAAALE H H DALA G 



781 CGGCACGTGGGCGCTGGGCGCCGACCGATCCGAGCCGCTCGGCGACATCTTCCGGGCCGT 840 
GTWALGADRSEPL GDIFRAV 



841 C7CCGGCAGCGTCGCCCGGCAGACCGGCAGCCCCGCCGTCGACGTGGTCACCGTGCCCGC 900 
SGSVARQTGSPAVDVVTVPA 



901 GCCCGAGCACGCCGAGGCCAACGACTTCCGCAGCGACGACATCGACTCCACCGAGTTCCG 960 
PEHAEANDFRSDDIDSTEFR 

961 CAGCCGGACCGGCTGGCGCCCCCGGGTTTCCCTCACCGACGGCATCGACCGGACGGTGGC 1020 
SRTGWRPRVSLTDGIDRTVA 

1021 CGCCCTGACCCCCACCGAGGAGCACTAGTGCGGGTACTGCTGACGTCCTTCGCGCACCGC 1080 
ALTPTEEH* 

VRVLLTSFAHR 



1081 ACGCACTTCCAGGGACTGGTCCCGCTGGCGTGGGCGCTGCGCACCGCGGGTCACGACGTG 1140 
THFQGLVPLAWALRTAGHDV 

1141 CGCGTGGCCGCCCAGCCCGCGCTCACCGACGCGGTCATCGGCGCCGGTCTCACCGCGGTA 1200 
RVAAQPALTDAVIGAGLTAV 

1201 CCCGTCGGCTCCGACCACCGGCTGTTCGACATCGTCCCGGAAGTCGCCGCTCAGGTGCAC 1260 
PVGS DHRLFD IVPEVAAQVH 



• • . • • 

1261 CGCTACTCCTTCTACCTGGACTTCTACCACCGCGAGCAGGAGCTGCACTCGTGGGAGTTC 1320 
RYSFYLDFYHREQELHSWEF 



„ . . • • 

1321 CTGCTCGGCATGCAGGAGGCCACCTCGCGGTGGGTATACCCGGTGGTCAACAACGACTCC 1380 
LLGMQEATSRWVYPVVNNDS 

. . • • • 

1381 TTCGTCGCCGAGCTGGTCGACTTCGCCCGGGACTGGCGTCCTGACCTGGTGCTCTGGGAG 1440 
FVAELVDFARDW RPDLVLWE 

1441 CCGTTCACCTTCGCCGGCGCCGTCGCGGCCCGGGCCTGCGGAGCCGCGCACGCCCGGCTG 1500 
PFTFAGAVAARACGAAHARl 



1501 CTGTGGGGCAGCG ACCT C ACCGGCTACTTCCGCGGCCGGTTCCAGGCGCAACGCCTGCGA 1560 
LWGS O LTGYF RGRFQAQ RLR 



• • • • 

1561 CGGCGGCCGGAGGACCGGCCGGACCCGCTGGGCACGTGGCTGACCGAGGTCGCGGGGCGC 1620 
RPPEDRPDPLGTWLTEVAGR 



1621 TTCGGCGTCGAATTCGGCGAGGACCTMCGGTC^ 1680 
FGVEFGEDLAVGQ WSVDQLP 



1681 CCGAGTTTCCGGCTGGACACCGGAATGGAAAGCGTTGTCGCGCGGACCCTGCCCTACAAC 1740 
PSFRfcDTGMETVVARTLPYN 

• * • • • ' * 

1741 GGCGCGTCGGTGGTTCCGGACTGGCTCAAGAAGGGCAGTGCGACTCGACGCATCTGCATT 1800 
GASVVPDWLKKGSATRRICI 

• . • • • * 

1801 ACCGGAGGGTTCTCCGGACTCGGGCTCGCCGCCGATGCCGATCAGTTCGCGCGGACGCTC 1860 
TGGFSGLGLAADADQFARTL 

1861 GCGCAGCTCGCGCGATTCGATGGCGAAATCGTGGTTACGGGTTCCGGTCCGGATACCTCC 1920 
AQLARFDGEIVVTGSGPDTS 

1921 GCGGTACCGGAC AACATTCGTTTGGTGG AT TTCGTTCCGATGGGCGTTCTGCTCCAGAAC 1980 
AVPDNIRLVDFVPMGVLLQN 

1981 TGCGCGGCG ATCATCC ACCACGGCGGGGCCGGAACCTGGGCC ACGGCACTGCACC ACGGA 2040 
CAAIIHHGGAGTWATALHHG 

2041 ATTCCGCAAATATCAGTTGCACATGAATGGGATTGCATGCTACGCGGCCAGCAGACCGCG 2100 
IPQISVAHEWDCMLRGQQTA 

2101 GAACTGGGCGCGGGAATCTACCTCCGGCCGGACGAGGTCGATGCCGACTC ATTGGCGAGC 2160 
ELGAGIYLRPDEVDADSLAS 

2161 GCCCTCACCCAGGTGGTCGAGGACCCCACCTACACCGAGAACGCGGTGAAGCTTCGCGAG 2220 
ALTQVVEDPT YTENAVKLRE 

2221 GAGGCGCTGTCCGACCCGACGCCGCAGGAGATCGTCCCGCGACTGGAGGAACTCACGCGC 2280 
EALSDPTPQE IVPRLEELTR 

2281 CGCC ACGCCGGCTAGCGGTTTCCG ACCGAC AAGTCCGTCCGACAGCACACCTCC GG AGGG 2340 
R H A G * 

2341 AGCAGGGATGTACGAGGGCGGGTTCGCCGAGCTTTACGACCGGTTCTACCGCGGCCGGGG 2400 
M YEGGFAELYDRFYRGRG 

2401 CAAGGACT ACGCGGCCGAGGCCGCGCAGGTCGCGCGGCTGGTCA6AGACCGCCTGCCCTC 2460 
KDYAAEA AQVARLVRDRLPS 

2461 GGCTTCCTCGCIGCTCGACGTGGCCTGCGGGACCGGCACCCACCTGCGCCGGTTCGCCGA 2520 
ASS LLDVACGTGTHLRRFAD 

2521 CCTCnCGACGACGTGACCGGGCTGGAGCTGTCGGCGGCGATGATCGAGGTCGCCCGGCC 2580 
tFD D VTGLELSAAMIEVARP 

2581 GCAGCTCGGCGGCATCCCGGTGCTGCAGGGCGACATGCGCGACTTCGCGCTGGATCGCGA 2640 
Q LG G I PVLQGDMRDF A LD.RE 

2641 GTTCGACGCCGTCACCTGCATGTTCAGCTCCATCGGGCACATGCGCGACGGCGCCGAGCT 2700 
FDAVTCMFSS IGHMRDGAEL 

2701 GGACCAGGCGCTGGCGTCCTTCGCCCGCCACCTCGCCCCCGGCGGCGTCGTGGTGGTCGA 2760 
DQALASFARHLAPGGVVVVE 

2761 ACCGTGGTGGTTCCCGGAGGACTTCCTCGACGGCTACGTGGCCGGTGACGTGGTGCGCGA 2820 
PWWFPEDFLDGYVAGDVVRD 

2821 CGGCGACCTGACGATCTCGCGCGTCTCGCACTCCGTGCGCGCCGGCGGCGCGACCCGGAT 2880 
GDLTISRVSHSVRAGGATRM 

2881 GGAGATCCACTGGGTCGTGGCCGACGCGGTGAACGGTCCGCGGCACCACGTGGAGCACTA 2940 
E IHWVVA DAVNGPRHHVEHY 

2941 CGAGATCACGCTCTTCGAGCGGCAGCAGTACGAGAAGGCCTTCACCGCGGCCGGTTGCGC 3000 
EITLFERQQYEKAFTAAGCA 

♦ • * 

3001 TGTGCAGTACCTGGAGGGCGGACCCTCCGGACGCGGGTTGTTCGTCGGTGTGCGCGGATG 3060 

VQYLEGGP SGRGIiFVGVRG* 

• • * • * 

3061 ACCCGTGCGTCGCGTTTTCCGTTCCTGGCACAGGTGATCCGCTCCACGGGCCCTrrCCCC 3120 

3121 GCCGTGACCGGACCCTTACAGTGAGTGCGGGTCTTGATCGACAACGCCCGGCGGCAGCAA 3180 

3181 GCGGAGCCGTCGACGACACCGCAGGGAGACTCGATGGGT^TCGGACCGGCGMCGGACG 3240 

MGDRTGDRT 



3241 ATTCCGGAATCCTCGCAGACCGCAACGCGTTTCCTGCTCGGCGACGGCGGAATCCCCACC 3300 
IPESSQTATRFLLGDGGIPT 

. • • * • 

3301 GCCACGGCGGAAACCCACGACTGGCTGACCCGCAACGGCGCCGAGCAGCGGCTCGAGGTG 3360 
ATAETHDWLTRNGAEQRLEV 

• •••** 

3361 GCGCGCGTGCCGTTCAGCGCCATGGACCGCTGGTCGTTCCAGCCCGAGGACGGCAGGCTC 3420 
ARVPFSAMDRWSFQPEDGRL 

• . • • • 

3421 GCCCACGAGTCCGGGCGCTTCTTCTCCATCGAGGGCCTGCACGTGCGGACGAACTTCGGC 3480 
AHESGRFfrSIEGLHVRTNFG 

3481 TGGCGGCGGGACTGGATCCAGCCCATCATCGTGCAGCCCGAGATCGGCTTCCTCGGCCTC 3540 
WRRDWIQP IIVQPEIGFLGL 

3541 ATCGTCAAGGAGTTCGACGGTGTGCTGCACGTGCTGGCGCAGGCCAAGGCCGAGCCGGGC 3600 
IVKEFDGV LHVLAQAKAEP G 

• • • • * * 

3601 AACATCAACGCCGTCCAGCTCTCCCCGACCCTGCAGGCGACCCGCAGCAACTACACCGGC 3660 

NINAVQLS PTLQATRSNYTG 

• • * 

3661 GTCCACCGCGGCTCGAAGGTCCGGTTCATCGAGTACITCAACGGCACGCGCCCGAGCCGG 3720 
VHRGSKVRFIEYFNGTRPSR 

3721 ATCCTCGTCGACGTGCTCCAGTCCGAGCAGGGCGCGTGGTTCCTGCGCAAGCGCAACGGG 3780 
ILVDVLQSEQGAWFLRKRNR 

3781 AACATGGTCGTCGAGGTGTTCGACGACCTGCCCGAGCACCCGAACTTCCGCTGGCTGACC 3840 
NMVVEVFDDLPEHPNFRWLT 



. • • 

3841 GTCGCGCAGCTGCGGGCGATGCTGCACCACGACAACGTGGTGAACATGGACCTGCGCACC 3900 
VAQLRAMLH HDNVVNMDLRT 

3901 GTGCTGGCCTGCGTCCCGACCGCCGTGGAGCGGGACCGGGCCGACGACGTGCTCGCGCGC 3960 
VLACVPTAVERDRADDVLAR 

* 

3961 CTGCCCGAGGGCTCGTTCCAGGCCCGGCTGCTGCACTCGTTCATCGGCGCGGGCACCCCG 4020 
LPEGSFQARLLHSFIGAGTP 

4021 GCCAACAACATGAACAGCCTGCTGAGCTGGATCTCCGACGTGCGCGCCAGGCGCGAGTTC 4080 
A N N M N S LLSWISDVRARREF 



* 

4081 GTGCAGCGCGGCCGCCCGCTGCCCGACATCGAGCGCAGCGGGTGGATCCGCCGCGACGAC 4140 
V Q R G R P LP DIERSGWIRRDD 

4141 GGCATCGAGCACGAGGAGAAGAAGTACTTCGACGTCTTCGGCGTCACGGTGGCGACCAGC 4200 
GIEREEKKYFDVFGVTVATS 

. • » • • * 

4201 GACCGCGAGGTCAACTCGTGGATGCAGCCGCTGCTCTCGCCCGCCAACAACGGCCTGCTC 4260 
DREVNSWMQPLLS PANNGLL 



• • • ♦ • • 

4261 GCCCTGCTGGTCAAGGACATCGGCGGCACGTTGCACGCGCTCGTGCAGCTGCGCACCGAG 4320 
ALLVKD IGGTLHALVQLRTE 



4321 GCGGGCGGGATGGACGTCGCCGAGCTGGCGCCTACGGTGCACTGCCAGCCCGAC AACTAC 4380 
AG GMDVAELAPTVHCQPDNY 

• . . - • • 

4381 GCCGACGCGCCCGAGGAGTTCCGACCGGCCTATGTGGACT ACGTGTTGAACGTGCCGCGC 4440 
A D A P EE FRPAYVDYVLNVPR 



4441 TCGCAGGTCCGCTACGACGCATGGCACTCCGAGGAGGGCGGCCGGTTCTACCGCAACGAG 4500 
SQVRYDAWHSEEGGRFYRNE 

4501 AACCGGTACATGCTGATC6AGGTGCCCGCCGACTTCGACGCCAGTGCCGCTCCCGACCAC 4560 
NRYMLI EVPADFDASAAPDR 



» • • • • • 

4561 CGGTGGATGACCTTCGACCAGATCACCTACCTGCTCGGGCACAGCCACTACGTCAACATC 4620 
RWMTFDQ1TYLL GHSHYVNI 



4621 CACGTGCGCAGCATCATCGCGTGCGCCTCGGCCGTCTACACCAGGACCGCCGGATGAAAC 4680 
HVRSIIACASAVYTRTAG* 

M K R 



4681 GCGCGCTGACCGACCTGGCGATCTTCGGCGGCCCCGAGGCATTCCTGCACACCCTCTACG 4740 
ALTDLAIFGGPEAFLHTLYV 

FIG. 4B-5 

4741 TGGGCAGGCCGACCGTCGGGGACCGGGAGCGGTTCTTCGCCCGCCTGGAGTGGGCGCTGA 4800 
GRPTVGDRERFFARLEWALN 

• • • • • 

4801 ACAACAACTGGCTG ACCAACGGCGG ACCACTGGTGCGCGAGTT TCGAGGGCCGGGT CGCCG 4860 
NNWLTNGGPLVREFEGRVAD 

4861 ACCTGGCGGGTGTCCGCCACTGCGTGGCCACCTGCAACGCGACGGTCGCGCTGCAACTGG 4920 
LAGVRHCVATCNATVALQLV 

4921 TGCTGCGCGCGAGCGACGTGTCCGGCGAGGTCGTCATGCCTTCGATGACGTTCGCiSGCCA 4 980 
L. RASDVSGEVVMPSMTFAAT 

4981 CCGCGCACGCGGCGAGCTGGCTGGGGCTGGAACCGGTGTTCTGCGACGTGGACCCCGAGA 5040 
AHAASWLGLEPVFCDVDPET 

5041 CCGGCCTGCTCGACCCCGAGCACGTCGCGTCGCTGGTCACACCGCGGACGGGCGCGATCA 5100 
GLLDPE HVASLVTPRTGAI I 

5101 TCGGCGTGCACCTCTGGGGCAGGCCCGCTCCGGTCGAGGCGCTGGAGAAGATCGCCGCCG 5160 
GVHLWGRPAPVEALE K 1AAE 

5161 AGCACCAGGTCAAACTCTTCTTCGACGCCGCGC^CGCGCT 5220 
HQ VKLFFDAAHALGCTAGGR 

5221 GGCCGGTCGGCGCCTTCGGCAACGCCGAGOTGTTCAGCTTCCACGCCACGAAGGCGGTGA 5280 
PVGAFGNAEVFSFHATKAVT 

5281 CCTCGTTCGAGGGCGGCGCCATCGTCACCGACGACGGGCTGCTGGCCGACCGCATCCGCG 5340 
SFEGGAIVTDDGLLADRIRA 

5341 CCATGCACAACTTCGGGATCGCACCGGACAAGCTGGTGACCGATGTCGGCACCAACGGCA 5400 
MHNFGIAPDKLVTDVGTNGK 

5401 AGATGAGCGAGTGCGCCGCGGCGATGGGCCTCACCTCGCTCGACGCCrrTC^ 5460 
MSECAAAM GLTSLDAFAETR 

5461 GGGTGCACAACCGCCTC^ACCACGCGCTCTACTCCGACGAGCTCCGCGACGTGCGCGGtt 5520 
V HKRLN H ALYSD E I* R D VRG I 

5521 TATCCGTGCACGCGTTCGATCCTGGCGAGCAGAACAACTACCAGTACGTGATCATCTCGG 5580 
SVHAFDPGEQHNYQYV II S V 

5581 TGGACTCCGCGGCCACCGGCATCXSACCGCGACCAGTTGCAGGCGATCCTGCGAGCGGA^ 5640 
D SAATGIDRDQLQAILRAEK 

5641 AGGTTGTGGCACAACCCTACTTCTCCCCCGGGTGCCACCAGATGCAGCCGTACCGGACCG 5700 
VVAQPYFSPGCHQMQPYRTE 

5701 AGCCGCCGCTGCCGCTGGAGAACACCGAACAGCTCTCCGACCGGGTGCTCGCGCTGCCCA 5760 
PPLRLENTEQLSDRVLALPT 

5761 CCGGCCCCGCGGTGTCCAGCGAGGACATCCGGCGGGTGTGCGACATCATCCGGCTCGCCG 5820 

GPAVS S ED IRRVCD I 1 R L A A 

• * • • • 

5821 CCACCAGCGGCGAGCTGATCAACGCGCAATGGGACCAGAGGACGCGCAACGGTTCGTGAC 5880 

TSGELINAQWDQRTRNGS* 

5881 GACCTGCGC^CAAGTGCCAGGAGGTTCGCXCCCCGATGAACACAACTCGTACGGCAACC 5940 

MNTTRTAT 

5941 GCCCAGGAAGCGGGGGTCGCCGACGCGGCGCGCCCGGACGTCGACCGGCGGGCGGTCGTG 6000 
AQEAGVADAARPDVDRRAVV

6001 CGGGCGCTGAGCTCGGAGGTCTCCCGCGTCACCGGCGCCGGTGACGGTGACGCCCACGTG 6060
RALSSEVSRVTGAGDGDAHV 

6061 CAGGCCGCCCGGCTCGCCGACCTCGCCGCGCACTACGGGGCGCACCCGTTCACGCCGCTG 6120 
QAARLADLAAHYGAHPFTPL 

6121 GAGCAGACGCGTGCGCGGCTCGGCCTGGACCGCGCGGAGTTCGCCCACCTGCTCGACCTG 6180
EQTRARLGLDRAEFAHLLDL

6181 TTCGGCCGCATCCCGGACCTGGGCACCGCGGTGGAGCACGGTCCGGCGGGCAAGTACTGG 6240 
FGRIPDLGTAVEHGPAGKYW

6241 TCCAACACGATCAAGCCGCTGGACGCCGCAGGCGCACTGGACGCGGCGGTCTACCGCAAG 6300 
SNTIKPLDAAGALDAAVYRK

6301 CCTGCCTTCCCCTACAGCGTCGGCCTGTACCCCGGGCCGACGTGCATGTTCCGCTGCCAC 6360
PAFPYSVGLYPGPTCMFRCH 

6361 TTCTGCGTGCGGGTGACCGGTGCCCGCTACGAGGCCGCATCGGT^ 6420 
FCVRVTGARYEAASVPAGNE

6421 ACGCTGGCCGCGATCATCGACGAGGTGCCCACGGACAACCCXJAAGGCGATGTACATGTCG 6480 
TLAAIIDEVPTDNPKAMYMS

6481 GGCGGGCTCGAGCCGCTGACCAACCCCGGTCTCGGCGAGCTGGTGTCGCACGCCGCCGGG 6540
GGLEPLTNPGLGELVSHAAG 

6541 CGCGGTTTCGACCTCACCGTCTACACCAACGCCTTCGCCCTCACCGAGCAGACGCTGAAC 6600
RGFDLTVYTNAFALTEQTLN 

6601 CGCCAGCCCGGCCTGTGGGAGCTGGGCGCGATCCGCACGTCCCTCTACGGGCTGAACAAC 6660 
RQPGLWELGAIRTSLYGLNN

6661 GACGAGTACGAGACGACCACCGGCAAGCGCGGCGCTTTCGAACGCGTCAAGAAGAACCTG 6720
DEYETTTGKRGAFERVKKNL 

6721 CAGGGCTTCCTGCGGATGCGCGCCGAGCGGGACGCGCCGATCCGGCTCGGCTTCAACCAC 6780
QGFLRMRAERDAPIRLGFNH

6781 ATCATCCTGCCGGGACGGGCCGACCGGCTCACCGACCTCGTCGACTTCATCGCCGAGCTC 6840 
IILPGRADRLTDLVDFIAEL 

6841 AACGAGTCCAGCCCGCAACGGCCGCTGGACTTCGTGACGGTGCGCGAGGACTACAGCGGC 6900
NESSPQRPLDFVTVREDYSG 

6901 CGCGACGACGGCCGGCTGTCGGACTCCGAGCGCAACGAGCTGCGCGAGGGCCTGGTGCGG 6960
RDDGRLSDSERNELREGLVR 

6961 TTCGTCGACTACGCCGCCGAGCGGACCCCGGGCATGCACATCGACCTGGGCTACGCCCTG 7020 
FVDYAAERTPGMHIDLGYAL

7021 GAGAGCCTGCGGCGGGGTGTGGACGCCGAGCTGCTGCGCATCCGGCCGGAGACGATGCGT 7080 
ESLRPGVDAELLRIRPETMR

7081 CCCACCGCGCACCCCCAGGTCGCGGTGCAGATCGACCTGCTCGGCGACGTCTACCTCTAC 7140
PTAHPQVAVQIDLLGDVYLY

7141 CGCGAGGCGGGCTTCCCGGAGCTGGAGGGCGCCACCCGCTACATCGCGGGCCGGGTCACC 7200 
REAGFPELEGATRYIAGRVT 

7201 CCGTCGACCAGCCTGCGCGAGGTGGTGGAGAACTTCGTGCTGGAGAACGAGGGCGTGCAG 7260 
PSTSLREVVENFVLENEGVQ

7261 CCCCGCCCCGGCGACGAGTACTTCCTCGACGGCTTCGACCAGTCGGTGACCGCACGGCTC 7320
PRPGDEYFLDGFDQSVTARL

7321 AACCAGCTCGAACGAGACATCGCCGACGGGTGGGAGGACCACCGCGGCT 7380
NQLERDIADGWEDHRGFLRG 

7381 AGGTGAACCGGAGTTGCGAGTACGTGAGCTGGCGGTGGCGGGCGGTTTCGAGTTCACCCC 7440
R* VAGGFEFTP

7441 CGACCCGAAGCAGGACCGGCGGGGCCTGTTCGTGTCTCCGCTGCAGGACGAGGCGTTCGT 7500
DPKQDRRGLFVSPLQDEAFV

7501 GGGCGCGGTGGGCCATCGGTTCCCCGTCGCCCAGATGAACCACATCGTCTCCGCCCGGGG 7560
GAVGHRFPVAQMNHIVSARG

7561 CGTGCTGCGCGGGCTGCACTTCACCACCACCCCGCCGGGGCAGTGCAAGTACGTCTACTG 7620
VLRGLHFTTTPPGQCKYVYC

7621 CGCGCGCGGCCGGGCGCTCGACGTCATCGTCGACATCCGGGTCGGCTCGCCGACGTTCGG 7680
ARGRALDVIVDIRVGSPTFG

7681 GAAGTGGGACGCGGTGGAGATGGACACCGAGCACTTCCGGGCGGTCTACTTCCCCAGGGG 7740
KWDAVEMDTEHFRAVYFPRG

7741 CACCGCGCACGCCTTCCTCGCGCTTGAGGACGACACCCTGATGTCGTACCTGGTCAGCAC 7800
TAHAFLALEDDTLMSYLVST

7801 GCCGTACGTGGCCGAGTACGAGCAGGCGATCGACCCGTTCGACCCCGCGCTGGGTCTGCC 7860
PYVAEYEQAIDPFDPALGLP

7861 GTGGCCCGCGGACCTGGAGGTCGT6CTCTCCGACCGCGACACGGTGGCCGTGGACCTGGA 7920
WPADLEVVLSDRDTVAVDLE

7921 GACCGCCAGGCGGCGAGGGATGCTGCCCGACTACGCCGACTGCCTCGGCGAGGAGCCCGC 7980
TARRRGMLPDYADCLGEEPA

7981 CAGCACCGGCAGGTGACGGGTCCCGAGCACGATCTGTTCGAAGTGGCGCAGGCGCTCGTC 8040
STGR*

8041 GTCGCGGTCGA 8051

FIG. 4B-9